LOAD DATA LOCAL INFILE loads local files directly into an AnalyticDB for MySQL Data Warehouse Edition table over a MySQL client connection. All text files are supported. CSV files are supported on clusters running V3.1.10.2 or later, or V3.2.0 or later.
Prerequisites
Before you begin, ensure that you have:
- An AnalyticDB for MySQL Data Warehouse Edition cluster
- A MySQL client with the local-infile option enabled (see Enable local-infile)
- A target table in the cluster
Syntax
```sql
LOAD DATA LOCAL
    INFILE 'file_name'
    [REPLACE | IGNORE]
    INTO TABLE table_name
    [{FIELDS | COLUMNS}
        [TERMINATED BY 'string']
        [[OPTIONALLY] ENCLOSED BY 'char']
    ]
    [LINES
        [TERMINATED BY 'string']
    ]
    [IGNORE number {LINES | ROWS}]
    [(column_name_or_user_var
        [, column_name_or_user_var] ...)]
```

Parameters
File location
| Parameter | Description |
|---|---|
| file_name | Path to the local file. Accepts an absolute path or a relative path. Relative paths are resolved relative to the client's working directory. |
Conflict resolution
| Parameter | Description |
|---|---|
| REPLACE | Overwrites existing rows when a duplicate primary key is detected. If a row fails to import, the operation stops immediately and subsequent rows are not imported. |
| IGNORE | Skips rows with duplicate primary keys or data errors and continues importing the remaining rows. |
In IGNORE mode, failed rows are skipped silently. In REPLACE mode, the import stops at the first failure, so some rows may not be imported. In either case, run SHOW WARNINGS after the import to review errors for failed rows.

File format
| Parameter | Default | Description |
|---|---|---|
| TERMINATED BY 'string' (FIELDS) | \t | Column delimiter. Same default as MySQL. |
| ENCLOSED BY 'char' (FIELDS) | — | Character wrapping each column value. For example, ENCLOSED BY '"' strips double quotation marks from values before import. |
| TERMINATED BY 'string' (LINES) | \n | Row delimiter. |
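As an illustration of how the delimiter and enclosure options behave together, the following sketch uses Python's csv module to approximate parsing with FIELDS TERMINATED BY '\t' ENCLOSED BY '"' (a simplified model for intuition, not the server's actual parser):

```python
import csv
import io

# One line of input: tab-delimited, values wrapped in double quotes.
line = '"1"\t"hello\tworld"\t"3"\n'

# delimiter ~ FIELDS TERMINATED BY, quotechar ~ ENCLOSED BY
reader = csv.reader(io.StringIO(line), delimiter="\t", quotechar='"')

# Quotes are stripped from each value; the tab inside a quoted value
# is kept as data rather than treated as a column delimiter.
print(next(reader))
```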
Row filtering and column mapping
| Parameter | Description |
|---|---|
| IGNORE number LINES | Skips the first number rows. Use this to skip header rows. |
| (column_name_or_user_var, ...) | Maps file columns to table columns by name. If not specified, columns are mapped in order. If the file has more columns than the table, extra columns are ignored. If the file has fewer columns, unmatched table columns receive their default values. |
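The positional mapping rules described above can be sketched as a small Python model (illustrative only; the column names and default values here are hypothetical, not taken from any real table):

```python
def map_columns(file_row, table_columns, defaults):
    """Illustrative model of positional column mapping:
    extra file columns are dropped, and table columns with no
    matching file column fall back to their defaults."""
    mapped = {}
    for i, col in enumerate(table_columns):
        if i < len(file_row):
            mapped[col] = file_row[i]
        else:
            mapped[col] = defaults[col]  # unmatched table column -> default
    return mapped

table_columns = ["a", "b", "c"]
defaults = {"a": 0, "b": "", "c": None}

# File has more columns than the table: the extra value is ignored.
print(map_columns(["1", "x", "y", "extra"], table_columns, defaults))

# File has fewer columns: b and c receive their default values.
print(map_columns(["1"], table_columns, defaults))
```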
Enable local-infile
Before running LOAD DATA LOCAL INFILE, enable the local-infile option on your MySQL client. Add the following lines to ~/.my.cnf:
```ini
[mysqld]
local-infile

[mysql]
local-infile
```

For more information about MySQL option files, see the MySQL documentation.
CSV file support
On clusters running V3.1.10.2 or later, or V3.2.0 or later, CSV files are supported and follow the OpenCSV standard by default. You can configure a custom column delimiter, but the row delimiter cannot be customized for CSV files.
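To see what OpenCSV-style parsing provides over a naive split, the following sketch uses Python's standard csv module, whose default quoting rules (double quotes as the quote character, `""` as an escaped quote) are close to OpenCSV's. This is an approximation for intuition, not the exact AnalyticDB parser:

```python
import csv
import io

# A CSV line with a comma inside a quoted field and an escaped quote.
line = '1,"last, first","she said ""hi"""\n'

# A naive split breaks on the comma inside the quoted field.
print(line.rstrip("\n").split(","))

# csv.reader honors the quoting rules: the embedded comma stays in the
# value, and the doubled quote is unescaped to a single quote.
print(next(csv.reader(io.StringIO(line))))
```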
To disable OpenCSV parsing for CSV files, run:

```sql
SET adb_config LOAD_CSV_DATA_WITH_OPENCSV_FORMAT=false;
```

Import data
The following example imports data from a local file named out.bak into a table named test in the adb_demo database. The file contains 5,000 rows. The column delimiter is \t and the row delimiter is \n. Row 8 contains invalid data.
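A test file matching this description could be generated with a short script like the one below. The file name, row count, and delimiters come from the example; the exact form of the invalid data in row 8 (a non-numeric value in the int primary-key column) is an assumption for illustration:

```python
# Generate ~/out.bak: 5,000 tab-delimited rows with \n row delimiters,
# where row 8 is deliberately invalid (assumed: a non-numeric primary key).
import os

path = os.path.expanduser("~/out.bak")
with open(path, "w") as f:
    for i in range(1, 5001):
        if i == 8:
            f.write("not_an_int\tbad row\n")  # row 8: invalid data
        else:
            f.write(f"{i}\tvalue_{i}\n")
```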
Step 1: Connect to the cluster and create the target table.
Connect to your AnalyticDB for MySQL cluster, then run CREATE DATABASE and CREATE TABLE to set up the destination:
```sql
CREATE DATABASE adb_demo;
USE adb_demo;

CREATE TABLE test (
    a int NOT NULL DEFAULT '0',
    b varchar NOT NULL,
    PRIMARY KEY (a)
) DISTRIBUTED BY HASH(a);
```

Step 2: Run LOAD DATA LOCAL INFILE.
Choose a conflict resolution mode based on how you want to handle bad rows:
IGNORE mode: skips invalid rows and continues. Use this when partial imports are acceptable.

```sql
LOAD DATA LOCAL INFILE '~/out.bak' IGNORE INTO TABLE test FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n';
```

REPLACE mode: stops on the first invalid row. Use this when you need to detect data problems immediately.

```sql
LOAD DATA LOCAL INFILE '~/out.bak' REPLACE INTO TABLE test FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n';
```

Skip header rows: use IGNORE n LINES to skip the first n rows (for example, a header row).

```sql
LOAD DATA LOCAL INFILE '~/out.bak' REPLACE INTO TABLE test FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' IGNORE 10 LINES;
```
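The behavior of the two conflict-resolution modes can be modeled in a few lines of Python. This is a simplified illustration of the semantics described above, not the server implementation, and the sample rows are hypothetical:

```python
def load_rows(rows, mode):
    """Simplified model of REPLACE vs IGNORE semantics.

    Each row is (primary_key, value), or None for an invalid row.
    Returns the resulting table and the row numbers that were skipped.
    """
    table = {}    # primary key -> value
    skipped = []  # row numbers that failed to import
    for lineno, row in enumerate(rows, start=1):
        if row is None:              # invalid row
            if mode == "REPLACE":
                break                # REPLACE stops at the first failure
            skipped.append(lineno)   # IGNORE skips it and continues
            continue
        key, value = row
        if key in table and mode == "IGNORE":
            skipped.append(lineno)   # duplicate key: keep the existing row
            continue
        table[key] = value           # REPLACE overwrites duplicates
    return table, skipped

rows = [(1, "a"), (2, "b"), None, (2, "c"), (3, "d")]
print(load_rows(rows, "IGNORE"))   # row 3 and duplicate row 4 are skipped
print(load_rows(rows, "REPLACE"))  # stops at row 3; rows 4-5 never load
```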
Step 3: Verify the result.
After import, check the row count:
```sql
SELECT COUNT(1) FROM test;
```

IGNORE mode: row 8 (invalid) is skipped; 4,999 rows are imported:

```
+----------+
| count(1) |
+----------+
|     4999 |
+----------+
1 row in set (0.14 sec)
```

REPLACE mode: the import stops at row 8 (the first invalid row) and returns an error:

```
ERROR 1064 (42000): [13000, 2019061210070703000511314203303000266] syntax error :syntax error => IDENTIFIER is not value type pos:34 row: 0 and ceil:0
```

Skip 10 lines: the first 10 rows are skipped; 4,990 rows are imported:

```
Query OK, 4990 rows affected (0.37 sec)
```

Step 4: Review errors for failed rows.
Run SHOW WARNINGS to see error details for any rows that failed to import:
```sql
SHOW WARNINGS;
```