Data Channel

Last Updated: Nov 12, 2017

MaxCompute provides two methods for importing and exporting data: running Tunnel commands directly on the console, or using the Tunnel SDK for Java.

Tunnel Commands

Data Preparation

Suppose that we have prepared a local file named wc_example.txt, whose contents are as follows:

  I LOVE CHINA!
  MY NAME IS MAGGIE.I LIVE IN HANGZHOU!I LIKE PLAYING BASKETBALL!

Here we save the file into the directory: D:\odps\odps\bin.

Create a MaxCompute Table

To import the data above into a MaxCompute table, we first need to create the table:

  CREATE TABLE wc_in (word string);

Execute Tunnel Command

After the input table is created, you can import the data on the MaxCompute console with the tunnel command, as follows:

  tunnel upload D:\odps\odps\bin\wc_example.txt wc_in;

After the upload succeeds, check the records in the table wc_in, as follows:

  odps@ $odps_project>select * from wc_in;
  ID = 20150918110501864g5z9c6
  +------+
  | word |
  +------+
  | I LOVE CHINA! |
  | MY NAME IS MAGGIE.I LIVE IN HANGZHOU!I LIKE PLAYING BASKETBALL! |
  +------+
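
The tunnel command can also export data from MaxCompute. As a minimal sketch (the local output path below is only an example, not part of the original scenario), the table can be downloaded back to a local file as follows:

  tunnel download wc_in D:\odps\odps\bin\wc_example_download.txt;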


Tunnel SDK

The following simple scenario shows how to upload data using the Tunnel SDK. Scenario description: upload data into MaxCompute, where the project is “odps_public_dev”, the table name is “tunnel_sample_test”, and the partition is “pt=20150801,dt=hangzhou”.

  1. Create a table and add the corresponding partition:

    CREATE TABLE IF NOT EXISTS tunnel_sample_test(
        id STRING,
        name STRING)
    PARTITIONED BY (pt STRING, dt STRING); -- Create a table.
    ALTER TABLE tunnel_sample_test
        ADD IF NOT EXISTS PARTITION (pt='20150801', dt='hangzhou'); -- Add the partition.
  2. Create the program directory structure of UploadSample, as follows:

    |---pom.xml
    |---src
        |---main
            |---java
                |---com
                    |---aliyun
                        |---odps
                            |---tunnel
                                |---example
                                    |---UploadSample.java

    UploadSample.java is the Tunnel source file; pom.xml is the Maven project file.

  3. Write the UploadSample program, as follows:

    package com.aliyun.odps.tunnel.example;

    import java.io.IOException;
    import java.util.Date;

    import com.aliyun.odps.Column;
    import com.aliyun.odps.Odps;
    import com.aliyun.odps.PartitionSpec;
    import com.aliyun.odps.TableSchema;
    import com.aliyun.odps.account.Account;
    import com.aliyun.odps.account.AliyunAccount;
    import com.aliyun.odps.data.Record;
    import com.aliyun.odps.data.RecordWriter;
    import com.aliyun.odps.tunnel.TableTunnel;
    import com.aliyun.odps.tunnel.TunnelException;
    import com.aliyun.odps.tunnel.TableTunnel.UploadSession;

    public class UploadSample {
        private static String accessId = "####";
        private static String accessKey = "####";
        private static String tunnelUrl = "http://dt-corp.odps.aliyun-inc.com";
        private static String odpsUrl = "http://service-corp.odps.aliyun-inc.com/api";
        private static String project = "odps_public_dev";
        private static String table = "tunnel_sample_test";
        private static String partition = "pt=20150801,dt=hangzhou";

        public static void main(String args[]) {
            Account account = new AliyunAccount(accessId, accessKey);
            Odps odps = new Odps(account);
            odps.setEndpoint(odpsUrl);
            odps.setDefaultProject(project);
            try {
                // Create an upload session for the target table and partition.
                TableTunnel tunnel = new TableTunnel(odps);
                tunnel.setEndpoint(tunnelUrl);
                PartitionSpec partitionSpec = new PartitionSpec(partition);
                UploadSession uploadSession = tunnel.createUploadSession(project,
                        table, partitionSpec);
                System.out.println("Session Status is : "
                        + uploadSession.getStatus().toString());
                TableSchema schema = uploadSession.getSchema();
                // Open a writer for block 0 and fill one record according to the column types.
                RecordWriter recordWriter = uploadSession.openRecordWriter(0);
                Record record = uploadSession.newRecord();
                for (int i = 0; i < schema.getColumns().size(); i++) {
                    Column column = schema.getColumn(i);
                    switch (column.getType()) {
                        case BIGINT:
                            record.setBigint(i, 1L);
                            break;
                        case BOOLEAN:
                            record.setBoolean(i, true);
                            break;
                        case DATETIME:
                            record.setDatetime(i, new Date());
                            break;
                        case DOUBLE:
                            record.setDouble(i, 0.0);
                            break;
                        case STRING:
                            record.setString(i, "sample");
                            break;
                        default:
                            throw new RuntimeException("Unknown column type: "
                                    + column.getType());
                    }
                }
                // Write the same record 10 times, then commit block 0.
                for (int i = 0; i < 10; i++) {
                    recordWriter.write(record);
                }
                recordWriter.close();
                uploadSession.commit(new Long[]{0L});
                System.out.println("upload success!");
            } catch (TunnelException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    Note: the accessId and accessKey above are placeholders. In actual operation, replace them with your own accessId and accessKey. (A multi-block variant of this upload is sketched after this procedure.)

  4. The configuration of pom.xml is shown as follows:

    <?xml version="1.0" encoding="UTF-8"?>
    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
        <modelVersion>4.0.0</modelVersion>
        <groupId>com.aliyun.odps.tunnel.example</groupId>
        <artifactId>UploadSample</artifactId>
        <version>1.0-SNAPSHOT</version>
        <dependencies>
            <dependency>
                <groupId>com.aliyun.odps</groupId>
                <artifactId>odps-sdk-core-internal</artifactId>
                <version>0.20.7</version>
            </dependency>
        </dependencies>
        <repositories>
            <repository>
                <id>alibaba</id>
                <name>alibaba Repository</name>
                <url>http://mvnrepo.alibaba-inc.com/nexus/content/groups/public/</url>
            </repository>
        </repositories>
    </project>
  5. Compile and run. Compile the UploadSample program:

    mvn package

    Run the UploadSample program. Here we use Eclipse to import the Maven project: right-click the Java project and click Import > Maven > Existing Maven Projects.

Right-click on ‘UploadSample.java’ and click Run As > Run Configurations.

Click Run. After it runs successfully, the console output is as follows:

  Session Status is : NORMAL
  upload success!
  6. Check the running result. Enter the following statement on the console:

    select * from tunnel_sample_test;

    The result is as follows:

    +----+------+----+----+
    | id | name | pt | dt |
    +----+------+----+----+
    | sample | sample | 20150801 | hangzhou |
    | sample | sample | 20150801 | hangzhou |
    | sample | sample | 20150801 | hangzhou |
    | sample | sample | 20150801 | hangzhou |
    | sample | sample | 20150801 | hangzhou |
    | sample | sample | 20150801 | hangzhou |
    | sample | sample | 20150801 | hangzhou |
    | sample | sample | 20150801 | hangzhou |
    | sample | sample | 20150801 | hangzhou |
    | sample | sample | 20150801 | hangzhou |
    +----+------+----+----+

    Notes:

    • As an independent service in MaxCompute, Tunnel provides its own access endpoints for users. In addition, multiple Tunnel services are deployed inside MaxCompute to meet the needs of different production segments and loads. To prevent confusion caused by multiple sets of access addresses, MaxCompute routes requests to the correct endpoint automatically. The detailed access policies are as follows:
    • If you access the Tunnel service from the production network, you only need to specify the MaxCompute endpoint (http://service.odps.aliyun-inc.com/api). You do not need to configure the Tunnel endpoint; it is routed automatically according to the MaxCompute endpoint.
    • If you access the Tunnel service from the office network, you only need to specify the MaxCompute endpoint (http://service-corp.odps.aliyun-inc.com/api). You do not need to configure the Tunnel endpoint; it is routed automatically according to the MaxCompute endpoint.
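
The UploadSample program writes all records through a single block (block ID 0). For larger uploads, one upload session can be split across several blocks, each with its own RecordWriter, and all blocks can be committed in one call. The following is a minimal sketch of that pattern, reusing the classes and placeholder values from UploadSample above; the class name MultiBlockUploadSample and the block count are illustrative, and the Tunnel endpoint is deliberately not set so that the automatic routing described in the notes above applies.

  package com.aliyun.odps.tunnel.example;

  import com.aliyun.odps.Odps;
  import com.aliyun.odps.PartitionSpec;
  import com.aliyun.odps.account.AliyunAccount;
  import com.aliyun.odps.data.Record;
  import com.aliyun.odps.data.RecordWriter;
  import com.aliyun.odps.tunnel.TableTunnel;
  import com.aliyun.odps.tunnel.TableTunnel.UploadSession;

  public class MultiBlockUploadSample {
      public static void main(String[] args) throws Exception {
          // Placeholder credentials and names, as in UploadSample above.
          Odps odps = new Odps(new AliyunAccount("####", "####"));
          odps.setEndpoint("http://service-corp.odps.aliyun-inc.com/api");
          odps.setDefaultProject("odps_public_dev");

          // No explicit Tunnel endpoint: rely on automatic routing from the MaxCompute endpoint.
          TableTunnel tunnel = new TableTunnel(odps);
          PartitionSpec spec = new PartitionSpec("pt=20150801,dt=hangzhou");
          UploadSession session = tunnel.createUploadSession("odps_public_dev",
                  "tunnel_sample_test", spec);

          int blockCount = 4;                    // illustrative: one writer per block ID
          Long[] blockIds = new Long[blockCount];
          for (int blockId = 0; blockId < blockCount; blockId++) {
              RecordWriter writer = session.openRecordWriter(blockId);
              for (int i = 0; i < 10; i++) {
                  Record record = session.newRecord();
                  record.setString(0, "sample");  // id column
                  record.setString(1, "sample");  // name column
                  writer.write(record);
              }
              writer.close();                    // each block must be closed before commit
              blockIds[blockId] = (long) blockId;
          }

          // Commit all written blocks in a single call.
          session.commit(blockIds);
          System.out.println("upload success!");
      }
  }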

Fluentd Import Scheme

In addition to the MaxCompute console and the Java SDK, data can also be imported through Fluentd.

Fluentd is open source software used to collect logs from a variety of sources (including application logs, access logs, and system logs). It allows users to select plug-ins to filter the log data and store it in different backends, including MySQL, Oracle, MongoDB, Hadoop, Treasure Data, AWS Services, Google Services, and MaxCompute. Fluentd is known for being compact and flexible, allowing users to customize data sources, filter processing, and target destinations. Currently, 300+ plug-ins run on the Fluentd architecture, and they are all open source. MaxCompute has also open-sourced its data import plug-in for Fluentd.

Environment Preparation

To import data into MaxCompute through Fluentd, prepare the following environment:

  • Ruby 2.1.0 or later.

  • Gem 2.4.5 or later.

  • Fluentd 0.10.49, or check the latest version on the Fluentd official website. Fluentd provides different versions for different operating systems. For details, refer to Fluentd Articles.

  • Protobuf 3.5.1 or later (Ruby protobuf).

Install Import Plug-in

Next, you can install the MaxCompute Fluentd import plug-in in either of the following two ways.

Method 1: install it through RubyGems.

  $ gem install fluent-plugin-aliyun-odps

MaxCompute has published this plug-in to RubyGems under the name ‘fluent-plugin-aliyun-odps’; you only need to install it with the ‘gem install’ command. (If the gem source cannot be accessed during installation, search the Internet for ‘change gem source’ to solve this problem.)

Method 2: install it from the plug-in source code.

  $ gem install protobuf
  $ gem install fluentd --no-ri --no-rdoc
  $ git clone https://github.com/aliyun/aliyun-odps-fluentd-plugin.git
  $ cp aliyun-odps-fluentd-plugin/lib/fluent/plugin/* {YOUR_FLUENTD_DIRECTORY}/lib/fluent/plugin/ -r

In the commands above, the second command installs Fluentd; if it is already installed, you can skip this command. The source code of the MaxCompute Fluentd plug-in is hosted on GitHub. After cloning it, copy it into the ‘plugin’ directory of Fluentd.

Use Plug-in

To import data using Fluentd, the key step is to configure Fluentd's ‘conf’ file. For more details about the conf file, refer to Fluentd Configuration File Introduction.

Example 1: import Nginx logs. The ‘source’ configuration in the conf file is as follows:

  <source>
    type tail
    path /opt/log/in/in.log
    pos_file /opt/log/in/in.log.pos
    refresh_interval 5s
    tag in.log
    format /^(?<remote>[^ ]*) - - \[(?<datetime>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*) "-" "(?<agent>[^\"]*)"$/
    time_format %Y%b%d %H:%M:%S %z
  </source>

Fluentd uses ‘tail’ to monitor whether the content of the specified file has changed. For more tail configuration, refer to Fluentd Articles. The ‘match’ configuration is as follows:

  <match in.**>
    type aliyun_odps
    aliyun_access_id ************
    aliyun_access_key *********
    aliyun_odps_endpoint http://service.odps.aliyun.com/api
    aliyun_odps_hub_endpoint http://dh.odps.aliyun.com
    buffer_chunk_limit 2m
    buffer_queue_limit 128
    flush_interval 5s
    project projectforlog
    <table in.log>
      table nginx_log
      fields remote,method,path,code,size,agent
      partition ctime=${datetime.strftime('%Y%m%d')}
      time_format %d/%b/%Y:%H:%M:%S %z
    </table>
  </match>

The data will be imported into the table nginx_log in the project ‘projectforlog’. The ‘datetime’ column in the source data is used as the partition key, and the plug-in automatically creates a partition when it encounters a new value.

Example 2: import data from MySQL. To import data from MySQL, first install ‘fluent-plugin-sql’ as the data source plug-in.

  $ gem install fluent-plugin-sql

Configure ‘source’ in ‘conf’:

  <source>
    type sql
    host 127.0.0.1
    database test
    adapter mysql
    username xxxx
    password xxxx
    select_interval 10s
    select_limit 100
    state_file /path/sql_state
    <table>
      table test_table
      tag in.sql
      update_column id
    </table>
  </source>

This example selects data from test_table, reading 100 records every 10 seconds. When selecting, the id column is taken as the primary key (id is an auto-increment field). For more descriptions of ‘fluent-plugin-sql’, refer to Fluentd SQL Plug-in Description.

The configuration of ‘match’ is shown as follows:

  <match in.**>
    type aliyun_odps
    aliyun_access_id ************
    aliyun_access_key *********
    aliyun_odps_endpoint http://service.odps.aliyun.com/api
    aliyun_odps_hub_endpoint http://dh.odps.aliyun.com
    buffer_chunk_limit 2m
    buffer_queue_limit 128
    flush_interval 5s
    project your_projectforlog
    <table in.sql>
      table mysql_data
      fields id,field1,field2,field3
    </table>
  </match>

The data will be imported into the table ‘mysql_data’ in the project ‘your_projectforlog’. The imported fields are id, field1, field2, and field3.

Plug-in Parameter Description

To import data into MaxCompute, configure the MaxCompute plug-in in the ‘match’ item of the conf file. The supported parameters are described as follows:

  • type (fixed): the fixed value aliyun_odps.
  • aliyun_access_id (required): your access ID.
  • aliyun_access_key (required): your access key.
  • aliyun_odps_hub_endpoint (required): if your server is deployed on ECS, set this value to ‘http://dh-ext.odps.aliyun-inc.com’; otherwise, set it to ‘http://dh.odps.aliyun.com’.
  • aliyun_odps_endpoint (required): if your server is deployed on ECS, set this value to ‘http://odps-ext.aliyun-inc.com/api’; otherwise, set it to ‘http://service.odps.aliyun.com/api’.
  • buffer_chunk_limit (optional): block size; the supported units are “k” (KB), “m” (MB), and “g” (GB). The default value is 8 MB; the suggested value is 2 MB.
  • buffer_queue_limit (optional): the length of the block queue. Together with ‘buffer_chunk_limit’, this value determines the size of the entire buffer.
  • flush_interval (optional): the forced sending interval. When this interval is reached and the block is still not full, the buffered data is sent anyway. The default value is 60s.
  • project (required): the project name.
  • table (required): the table name.
  • fields (required): corresponds to ‘source’; the field names must exist in ‘source’.
  • partition (optional): set this item if the table is a partitioned table. The partition name can be set in the following ways:
    • fixed value: partition ctime=20150804
    • keyword: partition ctime=${remote} (remote is a field in ‘source’).
    • time format keyword: partition ctime=${datetime.strftime(‘%Y%m%d’)} (‘datetime’ is a time field in ‘source’; its value formatted as %Y%m%d is used as the partition value).
  • time_format (optional): if a ‘time format keyword’ is used in ‘partition’, set this parameter. For example, if source[datetime]=”29/Aug/2015:11:10:16 +0800”, set ‘time_format’ to “%d/%b/%Y:%H:%M:%S %z”.

Flume

Besides using Fluentd to import data, MaxCompute also supports importing data through Flume. Flume is open source software from Apache, and MaxCompute has open-sourced an import plug-in based on it. For more details, refer to Flume MaxCompute Plug-in.
