Upload full data

Last Updated: Oct 14, 2021

MaxCompute provides multiple methods for uploading data. For example, you can use the Data Integration service provided by DataWorks or run a Tunnel command to upload full data. For more information, see Overview of data uploads and downloads.

This topic describes how to run a Tunnel command to upload data from an on-premises machine to a MaxCompute project.

Procedure

Download sample full data

1. Find the download link for sample full data based on the industry attribute of the instance that you want to create.

2. Download three tables that contain the sample full data to your on-premises machine.

In this topic, an instance of the news industry is used. You can download the behavior table, item table, and user table. Then, store the tables in your Object Storage Service (OSS) bucket.

Install the MaxCompute client

The MaxCompute client is used to create tables and upload data in subsequent steps. To install the client, see the Client documentation or the Authorize a RAM user topic.

Create three tables in MaxCompute

You can create MaxCompute tables by using the MaxCompute client.

Create a MaxCompute table by using the MaxCompute client

Enter the table creation statement in the command-line interface (CLI). The statement must end with a semicolon (;). In the following figure, the table creation statement is entered to create a behavior table.

Figure: Create three tables

Make sure that no line feeds are used in the table creation statement.

For more information about table creation statements for different industries, see item industry, news industry, and content industry.
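Because the statement must be entered on one line and end with a semicolon, it can help to flatten a statement that you keep in multi-line form before you paste it into the client. The following is a minimal sketch in Python. The helper name to_single_line and the column list are hypothetical and for illustration only. Use the statement from the documentation for your industry.

```python
def to_single_line(statement: str) -> str:
    """Collapse a multi-line SQL statement into one line that ends with a semicolon."""
    # split() without arguments splits on any run of whitespace, including line feeds.
    flat = " ".join(statement.split())
    # Remove trailing semicolons and spaces, then append exactly one semicolon.
    flat = flat.rstrip("; ")
    return flat + ";"


# Hypothetical schema for illustration only; it is not the official
# table definition for any industry.
ddl = """
CREATE TABLE IF NOT EXISTS behavior_airec_test (
    user_id STRING,
    item_id STRING,
    bhv_type STRING
)
PARTITIONED BY (ds STRING)
"""

print(to_single_line(ddl))
```

This sketch collapses every run of whitespace, so it is not suitable for statements that contain string literals with significant spacing or line comments that start with two hyphens.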

Upload data

1. Start the MaxCompute client.

2. Run the following Tunnel command to upload data. For more information about Tunnel commands, see Tunnel commands.

tunnel upload -acp=true -h=true /Users/xxx/workspace/data/news/behavior_news.csv behavior_airec_test/ds=20190125

In this example, the destination table is a partitioned table whose partition key column is ds. To upload data to a partitioned table, you must append the partition information to the table name, such as behavior_airec_test/ds=20190125.

Figure: Run a Tunnel command

Setting -h to true indicates that the data file contains a table header. The header row is skipped and is not uploaded. Setting -acp to true specifies that the destination partition is automatically created if it does not exist.

3. Check whether the data is uploaded to the table.

select * from behavior_airec_test where ds = 20190125;
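The upload and check steps can be scripted when you repeat them for several files. The following Python sketch builds the Tunnel command string used in this topic and counts the data rows in a local CSV file, so that you can compare the count with the number of rows in the partition. The helper names tunnel_upload_command and count_data_rows are hypothetical. The sketch assumes a UTF-8 encoded file with no line breaks inside fields and paths without spaces.

```python
import csv


def tunnel_upload_command(csv_path: str, table: str, ds: str) -> str:
    """Build the Tunnel upload command for one file and one ds partition."""
    # -acp=true creates the partition if it does not exist.
    # -h=true marks the first row of the file as a header that is not uploaded.
    return f"tunnel upload -acp=true -h=true {csv_path} {table}/ds={ds}"


def count_data_rows(csv_path: str, has_header: bool = True) -> int:
    """Count the records in a local CSV file, excluding the header row if present."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        rows = sum(1 for _ in csv.reader(f))
    return max(rows - 1, 0) if has_header else rows


# Reproduces the command from step 2 for the behavior table.
print(
    tunnel_upload_command(
        "/Users/xxx/workspace/data/news/behavior_news.csv",
        "behavior_airec_test",
        "20190125",
    )
)
```

If the value returned by count_data_rows for the local file matches the number of rows that the query in step 3 returns for the partition, the upload is likely complete.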