This topic describes the features of Tunnel commands and how to use Tunnel commands to upload or download data.
Features
The MaxCompute client provides Tunnel commands that implement the features of the Dship tool. You can use Tunnel commands to upload or download data. This section describes the features of the following Tunnel commands:
- UPLOAD: uploads local data to a MaxCompute table. You can upload files or level-1 directories to only one table or partition in a table each time. For a partitioned table, you must specify the partition to which you want to upload data. For a multi-level partitioned table, you must specify a lowest-level partition. For more information, see Upload.
- DOWNLOAD: downloads MaxCompute table data or the query results of a specific instance to a local directory. You can download data from only one table or partition to a single local file each time. For a partitioned table, you must specify the partition from which you want to download data. For a multi-level partitioned table, you must specify a lowest-level partition. For more information, see Download.
- RESUME: resumes the transfer of files or directories when your network is disconnected or the Tunnel service is faulty. You cannot use this command to resume data downloads. For more information, see Resume.
- SHOW: displays historical task information. For more information, see Show.
- PURGE: clears the session directory. Logs from the last three days are deleted by default. For more information, see Purge.
- HELP: queries help information. Command aliases are supported.
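These commands are run on the MaxCompute client and share a common syntax. For the exact options that your client version supports, you can query the built-in help. A minimal sketch (assuming a recent client; the output depends on the client version):
tunnel help;
tunnel help upload;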
Limits of Tunnel commands
- Tunnel commands and Tunnel SDK cannot be used for external tables. You can use Tunnel to upload data to MaxCompute internal tables. You can also use OSS SDK for Python to upload data to OSS and map the data to external tables in MaxCompute. For more information about external tables, see Overview.
- You cannot use Tunnel commands to upload or download data of the ARRAY, MAP, or STRUCT type.
- On the server, a session remains valid for 24 hours after it is created. A session can be shared among processes and threads on the server, but you must make sure that each block ID is unique.
- If you download data by session, only data of the session created by using your Alibaba Cloud account can be downloaded by this account or its RAM users.
Upload and download table data
Before you upload table data, prepare the data file data.txt that contains the table data you want to upload and save it to d:\data.txt. The file contains the following data:
shopx,x_id,100
shopy,y_id,200
shopz,z_id
Note In this example, the third row of the data.txt file does not comply with the schema of the sale_detail table. The sale_detail table defines three columns, but the third row of this file contains only two columns.
To upload and download table data, perform the following steps:
- On the MaxCompute client, execute the following statements to create a partitioned table named sale_detail and add a partition to the table.
CREATE TABLE IF NOT EXISTS sale_detail(
    shop_name STRING,
    customer_id STRING,
    total_price DOUBLE)
PARTITIONED BY (sale_date STRING, region STRING);
alter table sale_detail add partition (sale_date='201312', region='hangzhou');
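Note Before you upload data, you can optionally confirm that the table and the partition exist. A minimal check with standard MaxCompute statements:
desc sale_detail;
show partitions sale_detail;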
- Run the UPLOAD command to upload the data.txt file to the partitioned table sale_detail.
- Sample statement
tunnel upload d:\data.txt sale_detail/sale_date=201312,region=hangzhou -s false
- Returned result
Upload session: 20150610xxxxxxxxxxx70a002ec60c
Start upload:d:\data.txt
Total bytes:41   Split input to 1 blocks
2015-06-10 16:39:22     upload block: '1'
ERROR: column mismatch -,expected 3 columns, 2 columns found, please check data or delimiter
Note The data fails to be imported because the data.txt file contains dirty data. The system returns the session ID and an error message.
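Note If you prefer to skip dirty rows instead of fixing the file, the upload command provides a discard-bad-records option. The following sketch assumes that your client version supports -dbr; rows that do not match the schema are then discarded instead of failing the upload. In this walkthrough, the file is fixed and the upload is resumed instead.
tunnel upload d:\data.txt sale_detail/sale_date=201312,region=hangzhou -dbr true;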
- Execute the following statement to verify the data upload.
- Sample statement
select * from sale_detail where sale_date='201312';
- Returned result
ID = 20150610xxxxxxxxxxxvc61z5
+-----------+-------------+-------------+-----------+--------+
| shop_name | customer_id | total_price | sale_date | region |
+-----------+-------------+-------------+-----------+--------+
+-----------+-------------+-------------+-----------+--------+
Note The data fails to be imported because the data.txt file contains dirty data. As a result, the table is empty.
- Run the SHOW command to query the ID of the failed session.
- Sample statement
tunnel show history;
- Returned result
20150610xxxxxxxxxxx70a002ec60c failed 'u --config-file /D:/console/conf/odps_config.ini --project odpstest_ay52c_ay52 --endpoint http://service.cn-shanghai.maxcompute.aliyun.com/api --id UlxxxxxxxxxxxrI1 --key 2m4r3WvTxxxxxxxxxx0InVke7UkvR d:\data.txt sale_detail/sale_date=201312,region=hangzhou -s false'
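If the history contains many sessions, you can limit the output. The following sketch assumes that your client version supports the -n option of the SHOW command; run tunnel help show to confirm the options that are available.
tunnel show history -n 5;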
- Modify the content in the data.txt file to the following information:
shopx,x_id,100
shopy,y_id,200
- Run the RESUME command to resume the data upload. 20150610xxxxxxxxxxx70a002ec60c is the ID of the failed session.
- Sample statement
tunnel resume 20150610xxxxxxxxxxx70a002ec60c --force;
- Returned result
start resume 20150610xxxxxxxxxxx70a002ec60c
Upload session: 20150610xxxxxxxxxxx70a002ec60c
Start upload:d:\data.txt
Resume 1 blocks
2015-06-10 16:46:42     upload block: '1'
2015-06-10 16:46:42     upload block complete, blockid=1
upload complete, average speed is 0 KB/s
OK
- Execute the following statement to verify the data upload:
- Sample statement
select * from sale_detail where sale_date='201312';
- Returned result
ID = 20150610xxxxxxxxxxxa741z5
+-----------+-------------+-------------+-----------+----------+
| shop_name | customer_id | total_price | sale_date | region   |
+-----------+-------------+-------------+-----------+----------+
| shopx     | x_id        | 100.0       | 201312    | hangzhou |
| shopy     | y_id        | 200.0       | 201312    | hangzhou |
+-----------+-------------+-------------+-----------+----------+
- Run the DOWNLOAD command to download data from the sale_detail table to the result.txt file.
tunnel download sale_detail/sale_date=201312,region=hangzhou result.txt;
- Check whether the result.txt file contains the following content. If it does, the data download is successful.
shopx,x_id,100.0
shopy,y_id,200.0
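Downloaded records are comma-separated by default. If you need a different column delimiter in the local file, you can specify a field delimiter. The following sketch assumes that the -fd option is available in your client version:
tunnel download sale_detail/sale_date=201312,region=hangzhou result.txt -fd "|";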
Download data of an instance
- Method 1: Run the Tunnel DOWNLOAD command to download the query results of a specific instance to a local file.
- Execute the SELECT statement to query the sale_detail table.
select * from sale_detail;
- Run the following command to download the query results to a local file. 20170724071705393ge3csfb8 is the ID of the instance that ran the SELECT statement, which is returned in the ID = line of the query output:
tunnel download instance://20170724071705393ge3csfb8 result;
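Note If you no longer have the query output that contains the instance ID, you can list recent instances first. The following sketch assumes that the SHOW INSTANCES statement, which lists the instances created by the current account, is available in your project:
show instances;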
- Method 2: Configure the required parameter to download the query results by using InstanceTunnel.
After you enable use_instance_tunnel on the MaxCompute client, you can use InstanceTunnel to download the query results that are returned by SELECT statements. This helps you download query results of any size at any time. You can use one of the following methods to enable this feature:
- Log on to the MaxCompute client of the latest version. Find the odps_config.ini file and make sure that use_instance_tunnel is set to true and instance_tunnel_max_record is set to 10000.
# download sql results by instance tunnel
use_instance_tunnel=true
# the max records when download sql results by instance tunnel
instance_tunnel_max_record=10000
Note The instance_tunnel_max_record parameter specifies the maximum number of SQL result records that can be downloaded by using InstanceTunnel. If you do not specify this parameter, the number of query results that can be downloaded is unlimited.
- Set console.sql.result.instancetunnel to true.
- Enable InstanceTunnel.
set console.sql.result.instancetunnel=true;
- Execute the SELECT statement.
select * from wc_in;
The following result is returned:
ID = 20170724081946458g14csfb8
Log view: http://logview/xxxxx.....
+------------+
| key        |
+------------+
| slkdfj     |
| hellp      |
| apple      |
| tea        |
| peach      |
| apple      |
| tea        |
| teaa       |
+------------+
A total of 8 records fetched by instance tunnel.
Note If InstanceTunnel is used, a message appears at the end of the query results. The message indicates the total number of records downloaded after you execute a SELECT statement. In this example, eight records are downloaded. You can set console.sql.result.instancetunnel to false to disable this feature.
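To disable the feature again in the current session, run the corresponding SET statement:
set console.sql.result.instancetunnel=false;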