All Products
Search
Document Center

PolarDB:Use a program to import data

Last Updated:Apr 18, 2024

This topic describes how to write code to import data into PolarDB-X.

In this topic, a sample table is used in the example to describe the operations that are performed to import data. The following statement is executed to create the sample table:

CREATE TABLE `test1` (
    `id` int(11) NOT NULL,
    `k` int(11) NOT NULL DEFAULT '0',
    `c` char(120) NOT NULL DEFAULT '',
    `pad` char(60) NOT NULL DEFAULT '',
    PRIMARY KEY (`id`),
    KEY `k_1` (`k`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4 dbpartition by hash(`id`);

Export source data from a database

You can generate the source data that you want to import to PolarDB-X. You can also export the source data from a database. You can run the mysql -e command to export the source data from a database that contains the sample table. PolarDB-X and MySQL support the command. The following sample code provides an example on how to run this command:

mysql -h ip  -P port -u usr -pPassword db_name -N -e "SELECT id,k,c,pad FROM test1;" >/home/data_1000w.txt
## The source data is separated by tab characters and is organized based on the following format: 188092293    27267211    59775766593-64673028018-...-09474402685    01705051424-...-54211554755.

mysql -h ip  -P port -u usr -pPassword db_name -N -e "SELECT id,k,c,pad FROM test1;" | sed 's/\t/,/g' >/home/data_1000w.csv
## The source data is exported as a CSV file. The data in the CSV file is separated by commas (,). The source data in a CSV file is organized based on the following format: 188092293,27267211,59775766593-64673028018-...-09474402685,01705051424-...-54211554755.

We recommend that you export the data in the CSV format to help the program that is described in the following section to read the data.

Create a table in your destination PolarDB-X database

The source data does not contain the statement that is used to create a table. Therefore, you must create a table in your destination PolarDB-X database. The following sample code provides an example on how to create a table:

CREATE TABLE `test1` (
    `id` int(11) NOT NULL,
    `k` int(11) NOT NULL DEFAULT '0',
    `c` char(120) NOT NULL DEFAULT '',
    `pad` char(60) NOT NULL DEFAULT '',
    PRIMARY KEY (`id`),
    KEY `k_1` (`k`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4 dbpartition by hash(`id`);

Use a program to import the source data into PolarDB-X

You can write a program and use the program to connect to your destination PolarDB-X database. After the destination database is connected, use the program to read the source data and perform batch inserts to import the source data into the destination PolarDB-X database.

The following sample code provides an example on how to write a Java program:

// mysql-connector-java.jar is required. For more information, visit https://mvnrepository.com/artifact/mysql/mysql-connector-java.
// Download link: https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.20/mysql-connector-java-8.0.20.jar.
// Note: Different versions of mysql-connector-java.jar contain Class.forName("com.mysql.cj.jdbc.Driver") class in different paths.
// Compile javac LoadData.java.
// Run java -cp .:mysql-connector-java-8.0.20.jar LoadData

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class LoadData {
    public static void main(String[] args) throws IOException, ClassNotFoundException, SQLException {
        File dataFile = new File("/home/data_1000w.csv");
        String sql = "insert into test1(id, k, c, pad) values(?, ?, ?, ?)";
        int batchSize = 1000;
        try (
            Connection connection = getConnection("ip", 3306, "db", "usr", "password");
            BufferedReader br = new BufferedReader(new FileReader(dataFile))) {
            String line;
            PreparedStatement st = connection.prepareStatement(sql);
            long startTime = System.currentTimeMillis();
            int batchCount = 0;

            while ((line = br.readLine()) != null) {
                String[] data = line.split(",");
                st.setInt(1, Integer.valueOf(data[0]));
                st.setInt(2, Integer.valueOf(data[1]));
                st.setObject(3, data[2]);
                st.setObject(4, data[3]);

                st.addBatch();
                if (++batchCount % batchSize == 0) {
                    st.executeBatch();
                    System.out.println(String.format("insert %d records", batchCount));
                }
            }
            if (batchCount % batchSize != 0) {
                st.executeBatch();
            }
            long cost = System.currentTimeMillis() - startTime;
            System.out.println(String.format("Take %d second,insert %d records, tps %d", cost/1000, batchCount, batchCount/(cost/1000)));
        }
    }
    /**
     * Obtain the information about the connection to the destination PolarDB-X database.
     *
     * @param host     The IP address of the destination PolarDB-X database.
     * @param port    The port number of the destination PolarDB-X database.
     * @param database The name of the destination PolarDB-X database.
     * @param username The username of the account that is used to connect to the destination PolarDB-X database.
     * @param password: The password of the account that is used to connect to the destination PolarDB-X database.
     * @return
     * @throws ClassNotFoundException
     * @throws SQLException
     */
    private static Connection getConnection(String host, int port, String database, String username, String password)
        throws ClassNotFoundException, SQLException {
        Class.forName("com.mysql.cj.jdbc.Driver");
        String url = String.format(
            "jdbc:mysql://%s:%d/%s?autoReconnect=true&socketTimeout=600000&rewriteBatchedStatements=true", host, port,
            database);
        Connection con = DriverManager.getConnection(url, username, password);
        return con;
    }
}

You can write your program based on actual application scenarios. For example, you can set an appropriate batch size and specify multiple threads to enhance performance.