Upload full data - Artificial Intelligence Recommendation

MaxCompute provides multiple methods for uploading data. For example, you can use the Data Integration service provided by DataWorks or run a Tunnel command to upload full data. For more information, see Data upload scenarios and tools.

This topic describes how to run a Tunnel command to upload data from an on-premises machine to a MaxCompute project.

Procedure

Download sample full data

1. Find the download link for sample full data based on the industry attribute of the instance that you want to create.

2. Download three tables that contain the sample full data to your on-premises machine.

In this topic, an instance of the news industry is used. Download the three sample tables for that industry and store them on your on-premises machine. The upload example in this topic reads one of them from the local path /Users/xxx/workspace/data/news/behavior_news.csv.

Install the MaxCompute client

The client is used to create tables and upload data in subsequent steps.
To install the client, see the MaxCompute client (odpscmd) topic or the MaxCompute documentation.

Create three tables in MaxCompute

You can create MaxCompute tables by using one of the following methods:

Use the DataWorks console.
Use the MaxCompute client.

Create MaxCompute tables by using the MaxCompute client

Enter the table creation statement in the command-line interface (CLI). The statement must end with a semicolon (;) and must not contain line feeds.
For more information about the specifications of table creation statements, see the topic for the industry of your instance.

Create MaxCompute tables in the DataWorks console

For more information, see DataWorks.
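
For the MaxCompute client method, a formatted statement can be collapsed into a single line before it is pasted into the CLI. The following Python snippet is a minimal sketch of that idea, not an official tool. The helper name to_single_line and the column list are hypothetical and serve only as an illustration; take the real columns from the specification for your industry. The sketch assumes that the statement contains no SQL line comments and no string literals in which whitespace matters.

```python
import re


def to_single_line(statement: str) -> str:
    """Collapse every run of whitespace (including line feeds) into one space
    and make sure the statement ends with exactly one semicolon."""
    collapsed = re.sub(r"\s+", " ", statement).strip()
    collapsed = collapsed.rstrip(";").rstrip()
    return collapsed + ";"


# Hypothetical schema for illustration only. The table name and the ds
# partition column come from this topic; the other columns are assumptions.
ddl = """
CREATE TABLE IF NOT EXISTS behavior_airec_test (
    item_id STRING,
    user_id STRING,
    bhv_type STRING
)
PARTITIONED BY (ds STRING)
"""

print(to_single_line(ddl))
```

The printed line can be pasted into the client as one statement.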

Upload data

1. Start the MaxCompute client.

2. Run the following Tunnel command to upload data. For more information about Tunnel commands, see Tunnel commands.

tunnel upload -acp=true -h=true /Users/xxx/workspace/data/news/behavior_news.csv behavior_airec_test/ds=20190125

In this example, behavior_airec_test is a partitioned table whose partition key column is ds. To upload data to a partitioned table, append the partition information to the table name, such as behavior_airec_test/ds=20190125.

The -h=true setting specifies that the data file contains a table header, which is skipped and not uploaded. The -acp=true setting specifies that the destination partition is automatically created if it does not exist.
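
If you upload several files or partitions, it can help to generate the command text instead of typing it each time. The following Python snippet is a minimal sketch that only assembles the command string from the pieces described above. The function name build_tunnel_upload and its parameters are hypothetical, and the sketch uses only the -acp and -h options shown in this topic. It assumes that the local path contains no spaces.

```python
def build_tunnel_upload(local_path, table, partition=None,
                        has_header=True, auto_create_partition=True):
    """Return the text of a Tunnel upload command for the MaxCompute client.

    partition is a partition spec such as "ds=20190125"; when it is given,
    it is appended to the table name after a slash.
    """
    target = table if partition is None else f"{table}/{partition}"
    flags = [
        f"-acp={'true' if auto_create_partition else 'false'}",
        f"-h={'true' if has_header else 'false'}",
    ]
    return " ".join(["tunnel", "upload", *flags, local_path, target])


# Reproduces the command used in this topic.
command = build_tunnel_upload(
    "/Users/xxx/workspace/data/news/behavior_news.csv",
    "behavior_airec_test",
    partition="ds=20190125",
)
print(command)
```

The resulting string is meant to be pasted into the MaxCompute client; the sketch does not run the client itself.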

3. Check whether the data is uploaded to the table.

select * from behavior_airec_test where ds = 20190125;
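
One way to make this check more concrete is to compare the number of rows in the partition, for example the result of a count(*) query with the same where clause, against the number of data rows in the local file. The following Python snippet is a minimal sketch of the local side of that comparison. The function name count_data_rows and the sample column names are hypothetical. The sketch assumes a UTF-8 encoded, comma-separated file and ignores blank lines, so adjust it if your file differs.

```python
import csv
import os
import tempfile


def count_data_rows(csv_path, has_header=True):
    """Count the non-empty rows of a CSV file, excluding the header row
    when has_header is True (matching an upload that skips the header)."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        rows = sum(1 for row in csv.reader(f) if row)
    if has_header and rows > 0:
        return rows - 1
    return rows


# Demonstration on a throwaway file; the column names are made up.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.csv")
    with open(sample, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(
            [["item_id", "user_id"], ["i1", "u1"], ["i2", "u2"]]
        )
    print(count_data_rows(sample))  # prints 2
```

If the local count and the count in the partition differ, review the upload output in the client before you continue.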