This topic describes how to use the LOAD DATA statement to import local data to AnalyticDB for MySQL.

Syntax

LOAD DATA LOCAL
    INFILE 'file_name'
    [REPLACE | IGNORE]
    INTO TABLE table_name
    [{FIELDS | COLUMNS}
        [TERMINATED BY 'string']
        [[OPTIONALLY] ENCLOSED BY 'char']
    ]
    [LINES
        [TERMINATED BY 'string']
    ]
    [IGNORE number {LINES | ROWS}]
    [(column_name_or_user_var
    [, column_name_or_user_var] ...)]
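
For example, a minimal statement that uses only the required clauses might look like the following sketch. The file path /path/to/out.bak is a placeholder, and test refers to an existing destination table:

    LOAD DATA LOCAL
        INFILE '/path/to/out.bak'
        INTO TABLE test;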

Parameters

  • LOAD DATA LOCAL INFILE: imports data from a local file.
  • file_name: the path and the name of the local file to be imported to the AnalyticDB for MySQL database.
    Note If the path specified by the file_name parameter is a relative path, the path is resolved relative to the working directory of the client.
  • table_name: the name of the destination table within the AnalyticDB for MySQL database.
    Note You do not need to add the database name before the table name.
  • REPLACE: overwrites the existing row with the row to be imported when the two rows have the same primary key.
  • IGNORE: skips rows that cannot be imported because of a duplicate primary key or a data error. As a result, some rows may not be imported.
  • [FIELDS] TERMINATED BY 'string': the column delimiter that separates the fields within each row. The default value is \t, which is the same as the default column delimiter in MySQL databases.
  • [FIELDS] ENCLOSED BY 'char': the character used to enclose the data of each column.

    For example, if a column contains the value "a" and this parameter is set to ENCLOSED BY '"', the enclosing double quotation marks are stripped and only a is imported.

  • [LINES] TERMINATED BY 'string': the row delimiter. The default value is \n.
  • IGNORE number LINES: the number of rows to be ignored from the beginning during data import. For example, IGNORE 1 LINES indicates that the first row is ignored and not imported to the AnalyticDB for MySQL database.
  • (column_name_or_user_var,...): the columns to be imported. If you do not specify this parameter, the columns of the file are mapped to the columns of the destination table in order. For a sketch that uses an explicit column list, see the example after this list.
    • If the file contains more columns than the destination table, the extra columns are ignored.
    • If the file contains fewer columns than the destination table, the remaining columns of the destination table are filled with their default values.
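
The following statement is a sketch that combines several of the optional clauses described above. The file path /path/to/data.csv, the comma delimiter, and the skipped header line are illustrative assumptions; the table and column names correspond to the test table that is created in the example later in this topic:

    LOAD DATA LOCAL
        INFILE '/path/to/data.csv'
        IGNORE
        INTO TABLE test
        FIELDS TERMINATED BY ',' ENCLOSED BY '"'
        LINES TERMINATED BY '\n'
        IGNORE 1 LINES
        (a, b);

This statement imports a comma-delimited file, strips double quotation marks that enclose values, skips the first (header) line, and maps the two file columns to the a and b columns of the destination table. Rows with duplicate primary keys are skipped because the IGNORE keyword is specified.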

Precautions

  • The local-infile feature must be enabled on the client.

    For example, to enable the local-infile feature for a MySQL command-line client, add the following configuration to the my.cnf file:

    cat ~/.my.cnf
    [mysqld]
    local-infile
    [mysql]
    local-infile                    

    For more information about the my.cnf file, see the official MySQL documentation.
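
    Alternatively, the local-infile feature can usually be enabled for a single session by starting the mysql client with the --local-infile option. The endpoint, port, user name, and database name in the following command are placeholders:

    mysql --local-infile=1 -h <cluster_endpoint> -P 3306 -u <user_name> -p -D <database_name>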

  • The atomicity of operations cannot be guaranteed during data import.
    • In IGNORE mode, rows that fail to be imported are skipped and the remaining rows are imported.
    • In REPLACE mode, if a row fails to be imported, the import operation is terminated immediately and some rows may not be imported.
  • You can execute the SHOW WARNINGS statement to view the error messages of failed rows.

Example

In this example, the data in the local file named out.bak is imported into the table named test in AnalyticDB for MySQL. The out.bak file contains 5,000 rows of data. The column delimiter is \t and the row delimiter is \n. The tenth row of data is invalid, as shown in the following data entries:

1 bb
2 bb
3 bb
4 bb
5 bb
6 bb
7 bb
8 bb
9 bb
bb   10
...          
  1. Connect to the target AnalyticDB for MySQL cluster. Execute the CREATE DATABASE and CREATE TABLE statements to create a table named test within the database named adb_demo to store the data to be imported from the local file. For more information, see Connect to a cluster, CREATE DATABASE, and CREATE TABLE.
    CREATE TABLE test (
        a int(11) NOT NULL DEFAULT '0',
        b varchar(8) NOT NULL,
        PRIMARY KEY (a)
    ) DISTRIBUTED BY HASH(a);
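
    The CREATE DATABASE statement is not shown in the preceding step. A minimal sketch, assuming that the adb_demo database does not exist yet and that your account has permission to create it, might look like the following:

    CREATE DATABASE adb_demo;
    USE adb_demo;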
  2. Execute the LOAD DATA statement on a MySQL client to import data from the local file named out.bak to the test table within the AnalyticDB for MySQL database.
    # Execute the LOAD DATA statement in IGNORE mode. In this mode, some rows may fail to be imported.
    mysql>  load data local infile '~/out.bak' ignore into table test FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' ;
    Query OK, 4999 rows affected, 1 warning (0.69 sec)
    mysql> show warnings;
    +---------+-------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | Level   | Code  | Message                                                                                                                                                                                            |
    +---------+-------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | Warning | 13000 | com.alibaba.cloud.analyticdb.common.sql.hardcode.HardParserException: syntax error :syntax error => IDENTIFIER is not value type pos:33 row: 0 and ceil:0 |
    +---------+-------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row in set (0.01 sec)
    
    mysql> select count(1) from test;
    +----------+
    | count(1) |
    +----------+
    |     4999 |
    +----------+
    1 row in set (0.14 sec)
    
    # Execute the LOAD DATA statement in REPLACE mode. In this mode, if a row fails to be imported, the subsequent import operation is terminated immediately.
    mysql> load data local infile '~/out.bak' replace into table test FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' ;
    ERROR 1064 (42000): [13000, 2019061210070703000511314203303000266] syntax error :syntax error => IDENTIFIER is not value type pos:34 row: 0 and ceil:0
    
    # Execute the LOAD DATA statement to skip the first 10 rows of data during data import.
    mysql> load data local infile '~/out.bak' replace into table test FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' IGNORE 10 LINES;
    Query OK, 4990 rows affected (0.37 sec)
    
    mysql> show warnings;
    Empty set (0.00 sec)
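
    As a follow-up check (not output captured from the session above), you can count the rows in the test table again. In this walkthrough the count should still be 4999, because the 4,990 rows imported in REPLACE mode overwrite rows that already exist with the same primary key instead of adding new rows:

    SELECT COUNT(1) FROM test;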