MaxCompute: FAQ about Tunnel commands

Last Updated: Mar 26, 2026

This topic answers common questions about the MaxCompute Tunnel Upload and Tunnel Download commands.

Quick reference: Limits and defaults

Parameter                              Limit or default
Max upload duration per session        2 hours
Max data record size                   200 MB
Data compression                       Enabled by default (charged based on compressed size)
Session lifecycle                      24 hours
Max blocks per session                 20,000
Recommended block size                 Greater than 64 MB
Recommended session creation interval  At least 5 minutes
History records stored                 500 (maximum)
Normal upload/download speed           1–20 MB/s

Tunnel Upload

Does Tunnel Upload support wildcards or regular expressions?

No. Use a shell script to loop through files and call Tunnel Upload for each file individually.
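
The same loop can be written in Python. The following is a minimal sketch, not an official tool: it expands a wildcard locally and builds one Tunnel Upload command per matching file. The function name build_upload_commands and the table name used in the usage note are hypothetical.

```python
import glob
import os


def build_upload_commands(pattern, table):
    """Expand a local wildcard and return one Tunnel Upload command per file."""
    commands = []
    for path in sorted(glob.glob(pattern)):
        if os.path.isfile(path):
            # Quote the path so that file names containing spaces survive.
            commands.append('tunnel upload "{}" {};'.format(path, table))
    return commands
```

Each returned string can then be passed to the client, for example with odpscmd -e, assuming the odpscmd executable is on your PATH.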

Is there a file size limit? Is the record size limited? Is data compressed?

There is no explicit file size limit, but each upload session must complete within 2 hours. Estimate your maximum upload size based on your network speed and this time limit.
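
As a rough worked example of that estimate, the sketch below multiplies a sustained upload speed by the 2-hour limit. The function name is hypothetical, and the result is an upper bound: real throughput fluctuates, so leave some margin.

```python
SESSION_LIMIT_SECONDS = 2 * 60 * 60  # 2-hour upload limit per session


def max_upload_mb(speed_mb_per_s):
    """Upper bound on the data (in MB) one session can upload at a sustained speed."""
    if speed_mb_per_s <= 0:
        raise ValueError("speed must be positive")
    return speed_mb_per_s * SESSION_LIMIT_SECONDS


# At a sustained 10 MB/s: 10 * 7200 = 72000 MB, roughly 70 GB.
print(max_upload_mb(10))
```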

Each data record cannot exceed 200 MB.

Data is compressed before upload by default. To disable compression when bandwidth allows, set -cp false. Billing is based on the size of the compressed data.

Can I upload multiple files to the same table or partition simultaneously?

Yes. Multiple files can be uploaded to the same table or partition at the same time.

Can multiple clients upload to the same table simultaneously?

Yes. Multiple clients can upload to the same table at the same time. Each concurrent upload operates independently.

Does the destination partition need to exist before upload?

Yes. By default, the partition must already exist. To create the partition automatically if it does not exist, set -acp to true. The default value is false. For more information, see Tunnel commands.

Am I charged based on compressed or uncompressed data size?

You are charged based on the compressed data size.

Can I control the upload speed?

No. Upload speed depends on your network bandwidth and server performance.

How do I speed up uploads?

Use the -threads parameter to upload data in parallel. The following example uploads the file with 10 parallel threads:

tunnel upload C:\userlog.txt userlog1 -threads 10 -s false -fd "\u0000" -rd "\n";

For large datasets (such as 10 GB of daily logs), upload by partition or table, and use multiple Elastic Compute Service (ECS) instances to parallelize the work.

I configured the cloud product interconnection network Tunnel endpoint, but the project still connects to the public endpoint. Why?

Configure the Tunnel endpoint in the odps_config.ini file of the MaxCompute client, in addition to the MaxCompute endpoints. For endpoint configuration details, see Endpoints.

You do not need to configure a Tunnel endpoint for MaxCompute projects in the China (Shanghai) region.

Error: FAILED: error occurred while running tunnel command on DataStudio

DataStudio does not support the Tunnel Upload command. Use the DataWorks visualized data import feature instead. For more information, see Import data to a MaxCompute table.

Why does upload fail when data contains line feeds or spaces?

Tunnel Upload treats the row delimiter and column delimiter literally, so line feeds or spaces in the data are interpreted as delimiters. Replace the delimiters in the data with custom characters and use the -rd (row delimiter) and -fd (column delimiter) parameters to specify the replacements.

For example, if your data uses @ as a row delimiter and , as a column delimiter:

Data file (d:\data.txt):

shopx,x_id,100@
shopy,y_id,200@
shopz,z_id,300@

Upload command:

tunnel upload d:\data.txt sale_detail/sale_date=201312,region=hangzhou -s false -fd "," -rd "@";

If you cannot change the delimiters, upload the data as a single column using \u0000 as the column delimiter, then parse it with a user-defined function (UDF).

What do I do if an out of memory (OOM) error occurs during upload?

An OOM error usually means the row or column delimiters are configured incorrectly, causing all data to be treated as a single row and loaded into memory at once.

Upload a small sample of your data to confirm the delimiters work correctly with the -rd and -fd parameters. If the sample succeeds, upload the full dataset.
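
Before uploading even the sample, a quick local check can reveal a wrong delimiter. The sketch below is only an approximation for sanity checking and does not reproduce Tunnel's parser. It splits a text sample with the delimiters you plan to pass to -rd and -fd and reports records whose column count is unexpected. The function name check_sample is hypothetical.

```python
def check_sample(sample_text, row_delim, col_delim, expected_columns):
    """Split a sample with the given delimiters and report column-count mismatches.

    Returns (record_count, mismatches), where mismatches is a list of
    (record_index, column_count) pairs.
    """
    records = sample_text.split(row_delim)
    # A trailing row delimiter leaves one empty string at the end; drop it.
    if records and records[-1] == "":
        records.pop()
    mismatches = []
    for index, record in enumerate(records):
        columns = record.split(col_delim)
        if len(columns) != expected_columns:
            mismatches.append((index, len(columns)))
    return len(records), mismatches


sample = "shopx,x_id,100@shopy,y_id,200@"
print(check_sample(sample, "@", ",", 3))   # correct delimiters: (2, [])
print(check_sample(sample, "\n", ",", 3))  # wrong row delimiter: (1, [(0, 5)])
```

A record count of 1 for a multi-record sample is the pattern that typically precedes the OOM error: everything is being read as a single row.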

How do I upload all files in a folder at once?

Tunnel Upload supports uploading a single file or all files in a single top-level directory. To upload all files in d:\data:

tunnel upload d:\data sale_detail/sale_date=201312,region=hangzhou -s false;

For more information, see Usage notes.

How do I upload multiple files to different partitions in a table?

Use a shell script to loop through files and upload each one to a separate partition. The following example runs on Windows (Linux is similar):

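#!/bin/sh
# Create the target table, partitioned by an integer dt column.
C:/odpscmd_public/bin/odpscmd.bat -e "create table user(data string) partitioned by (dt int);"
pt=0
# Iterate with a glob instead of parsing ls output, so names with spaces work.
for f in C:/userlog/*
do
    pt=$((pt + 1))   # POSIX arithmetic; let is not available in every sh
    echo "$f"
    echo "$pt"
    # Inner double quotes are escaped so they reach odpscmd intact.
    C:/odpscmd_public/bin/odpscmd.bat -e "alter table user add partition (dt=$pt);tunnel upload \"$f\" user/dt=$pt -s false -fd \"%\" -rd \"@\";"
done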

The following figure shows the output after uploading the userlog1 and userlog2 files.

[Figure: command output]

After the upload, query the table to verify the data.

[Figure: query output]

Does Tunnel Upload have a parameter to skip dirty data, like MySQL's -f parameter?

Yes. Set -dbr true to skip dirty data (rows with extra columns, missing columns, or type mismatches). The default is false, which stops the upload when dirty data is encountered. For more information, see Upload.

Error: StatusConflict — upload or download status conflict

ErrorCode=StatusConflict, ErrorMessage=You cannot complete the specified operation under the current upload or download status.

An upload is already in progress for that session. Wait for the current upload to finish before retrying.

Error: Error writing request body to server

java.io.IOException: Error writing request body to server

This error occurs when data cannot be written to the server, usually due to network timeouts:

  • If data is read from a database or remote store before upload, the read may take more than 600 seconds, causing a timeout. Fetch the data first, then call the Tunnel SDK to upload it.

  • If you are using a public endpoint, network instability may cause the timeout. Switch to a VPC or cloud product interconnection network endpoint.

Each block can contain 64 MB to 100 GB of data. To reduce timeout risk, keep each block under 10,000 records. A session supports a maximum of 20,000 blocks. If uploading from ECS instances, use internal endpoints. For more information, see Endpoints.

Error: NoSuchPartition — the specified partition does not exist

ErrorCode=NoSuchPartition, ErrorMessage=The specified partition does not exist

The target partition does not exist. Check partitions with:

show partitions table_name;

Create the missing partition with:

alter table table_name add [if not exists] partition partition_spec;

Error: Column Mismatch during upload

This error usually means invalid row delimiters are causing multiple records to be read as one. Check and fix the -rd parameter. Also verify the file does not end with extra line feeds.

If the error only appears when uploading a whole folder but not individual files, set -dbr false -s true to surface the format issue.

Error: ODPS-0110061 in multi-thread upload scenarios

ODPS-0110061: Failed to run ddltask - OTS transaction exception - Transaction timeout because cannot acquire exclusive lock.

Too many concurrent write operations are hitting the same table. Reduce the number of parallel write threads, add a delay between requests, and implement retry logic.
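
The retry logic can be sketched as follows. This is a generic exponential backoff pattern with jitter, not part of the Tunnel SDK: operation stands for whatever function performs your write, and the default attempt count and delay are assumptions to tune.

```python
import random
import time


def retry_with_backoff(operation, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call operation(); after each failure wait longer, then try again.

    The wait before retry k is base_delay * 2**k plus random jitter.
    In real code, catch only the exception type your client raises for
    lock timeouts instead of the broad Exception used here.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the last error.
            delay = base_delay * (2 ** attempt)
            # Jitter spreads out writers that failed at the same moment.
            sleep(delay + random.uniform(0, delay))
```

The sleep parameter exists only so the function can be tested without waiting; in normal use, leave it at its default.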

How do I skip the table header in a CSV file?

Set -h true in the Tunnel Upload command.

A large portion of data is missing after uploading a CSV file. Why?

This is caused by an invalid encoding format or incorrect delimiters. Fix the encoding or delimiters in the source file before re-uploading.

How do I upload data from a TXT file using a shell script?

Use the -e flag with odpscmd to run the Tunnel Upload command inline:

...\odpscmd\bin>odpscmd -e "tunnel upload "$FILE" project.table"

For more information about odpscmd startup parameters, see MaxCompute client (odpscmd).

When uploading two files, the second file is silently skipped after the first succeeds. Why?

Remove --scan=true from your Tunnel Upload command. This parameter causes an error during parameter passing in rerun mode, silently skipping subsequent files.

Block 22 failed, but the retry started from block 23. Is block 22 data lost?

Possibly. Each block is an independent HTTP request, and parallel uploads are atomic at the block level. The number of retries per block is limited: if the retries for block 22 exceeded the limit, the system moved on to block 23, and the data in block 22 may be missing.

After the upload completes, run the following statement to check for missing data:

select count(*) from table;

My oversized table upload times out before the session expires. What do I do?

Split the upload into two subtasks, each completing within the 24-hour session lifecycle.

Uploads are slow when there are many active sessions. What do I do?

Limit session creation to one session every 5 minutes or more, and set each block size to greater than 64 MB. The maximum block ID per session is 20,000, and data is only visible after the session is committed.
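
To see how block size interacts with the 20,000-block limit, the following sketch computes how many blocks a dataset needs. The function names are hypothetical and sizes are in MB.

```python
import math

MAX_BLOCKS_PER_SESSION = 20000  # per-session limit stated above


def blocks_needed(total_mb, block_mb):
    """Number of blocks required to hold total_mb of data."""
    if block_mb <= 0:
        raise ValueError("block size must be positive")
    return math.ceil(total_mb / block_mb)


def fits_in_one_session(total_mb, block_mb):
    """True if the data fits within the per-session block limit."""
    return blocks_needed(total_mb, block_mb) <= MAX_BLOCKS_PER_SESSION


# 1,000 MB in 64 MB blocks needs ceil(15.625) = 16 blocks.
print(blocks_needed(1000, 64))
```

With 64 MB blocks, one session holds at most 64 × 20,000 = 1,280,000 MB, so larger blocks are what allow a single session to carry more data.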

Why are there extra \r characters at the end of uploaded data?

Windows line endings are \r\n, while Linux and macOS use \n. Tunnel uses the operating system's default line ending as the row delimiter. If the file was created on Windows and uploaded from Linux or macOS, \r characters are included in the uploaded data.
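
One way to avoid the stray \r characters is to normalize the file before uploading. The sketch below (hypothetical function name) copies a file while converting Windows line endings to \n. Alternatively, passing the file's actual row delimiter explicitly with -rd should avoid the mismatch.

```python
def normalize_line_endings(src_path, dst_path):
    """Copy src_path to dst_path, converting CRLF line endings to LF."""
    # Binary mode prevents Python from translating newlines on its own.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in src:
            if line.endswith(b"\r\n"):
                line = line[:-2] + b"\n"
            dst.write(line)
```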

The column delimiter is a comma, but field values also contain commas. What do I do?

Change the column delimiter in the data file to a character that does not appear in field values, then use the -fd parameter to specify the new delimiter.
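
If the source file is a standard CSV in which fields containing commas are quoted, the conversion can be scripted. This is a minimal sketch under that assumption. The function name and the | delimiter are illustrative choices, and fields that contain line breaks are not handled.

```python
import csv


def change_column_delimiter(src_path, dst_path, new_delim="|"):
    """Rewrite a quoted CSV file so that columns are separated by new_delim.

    Raises ValueError if new_delim already occurs inside a field value.
    """
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        for row in csv.reader(src):
            for value in row:
                if new_delim in value:
                    raise ValueError("delimiter {!r} appears in data".format(new_delim))
            # No quoting needed: the new delimiter never appears in a value.
            dst.write(new_delim.join(row) + "\n")
```

The converted file can then be uploaded with -fd set to the new delimiter and -rd set to "\n".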

Data contains spaces as column delimiters or needs regex-based filtering. What do I do?

Tunnel Upload does not support regular expressions. Handle this in three steps:

  1. Upload the entire line as a single string column using \u0000 as the column delimiter, so the data is not split:

    create table userlog1(data string);
    tunnel upload C:\userlog.txt userlog1 -s false -fd "\u0000" -rd "\n";
  2. Write a Python or Java UDF to parse each line with a regular expression. The following example creates a UDF named ParseAccessLog:

    import re

    from odps.udf import annotate
    from odps.udf import BaseUDTF

    # Raw string so the backslashes reach the regex engine unchanged.
    regex = r'([(\d\.)]+) \[(.*?)\] - "(.*?)" (\d+) (\d+) (\d+) (\d+) "-" "(.*?)" - (.*?) - - (.*?) (.*?) - - - -'

    @annotate('string -> string,string,string,string,string,string,string,string,string,string,string')
    class ParseAccessLog(BaseUDTF):
        def process(self, line):
            # Skip NULL input and lines that do not match the log format.
            m = re.match(regex, line) if line is not None else None
            if m is not None:
                # The pattern has 11 capture groups, one per output column.
                self.forward(*m.groups())

    For more information, see Develop a Python UDF or Develop a Java UDF.

  3. Use the UDF to process the uploaded data and write the results to a new table:

    create table userlog2 as select ParseAccessLog(data) as (ip,date,request,code,c1,c2,c3,ua,q1,q2,q3) from userlog1;

There is dirty data after upload. What do I do?

Write all data to a non-partitioned table or to a single partition in one operation. Writing to the same partition multiple times can produce dirty data.

To find dirty data, run:

tunnel show bad <sessionid>;

To remove dirty data, use one of the following approaches:

  • Drop and re-create the table or partition:

    drop table table_name;
    -- or
    alter table table_name drop partition partition_spec;

    Then re-upload the data.

  • If the dirty data can be isolated with a WHERE clause, use an INSERT statement to write clean data into a new table or overwrite the existing partition:

    insert overwrite table target_table [partition partition_spec]
    select * from source_table where <filter condition>;

How do I upload data by using Tunnel?

To upload data by using Tunnel, perform the following steps:

  1. Prepare the source data, such as a source file or a source table.

  2. Design the table schema, specify partition definitions, convert data types, and then create a table on the MaxCompute client.

  3. Add partitions to the MaxCompute table. If the table is a non-partitioned table, skip this step.

  4. Upload data to the specified partition or table.

Tunnel Download

Which file formats does Tunnel Download support?

Tunnel Download exports data in TXT or CSV format only.

I'm downloading data within my project's region, but I'm being charged. Why?

If neither a virtual private cloud (VPC) Tunnel endpoint nor a cloud product interconnection network Tunnel endpoint is configured, downloads may be routed over the Internet, or even out of the region, and are then charged. To avoid charges, configure one of these two endpoints.

What do I do if downloads time out?

The most common cause is an invalid Tunnel endpoint. To check whether the Tunnel endpoint is valid, use Telnet to test the network connectivity.

If connectivity fails, check your endpoint configuration in odps_config.ini.
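
If Telnet is not installed, the same TCP-level check can be sketched in Python. The function name is hypothetical. Take the host from your Tunnel endpoint, and the port from its scheme (typically 80 for http and 443 for https).

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

A True result only shows that the port is reachable. It does not confirm that the endpoint is the correct one for your project's region and network type.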

Error: You have NO privilege 'odps:Select'

You have NO privilege 'odps:Select' on {acs:odps:*:projects/XXX/tables/XXX}. project 'XXX' is protected.

The MaxCompute project has data protection enabled. Only the project owner can export data from a protected project to another project.

How do I download only specific rows or columns?

Tunnel does not support filtering or computing data during download. Use one of the following approaches:

  • Run an SQL job to save the filtered data to a temporary table, download from the temporary table, then delete it.

  • For small datasets, run an SQL query directly without downloading.

Tunnel history

How long is Tunnel command history retained?

History retention is not time-based. By default, up to 500 records are stored.

Other issues

Can Tunnel directory names contain Chinese characters?

Yes.

What should I know about delimiters in Tunnel commands?

  • -rd specifies the row delimiter; -fd specifies the column delimiter.

  • The column delimiter cannot contain the row delimiter.

  • The default row delimiter on Windows is \r\n; on Linux it is \n.

  • Starting from MaxCompute client V0.21.0, the row delimiter in use is displayed when an upload starts, so you can confirm it before proceeding.

Can Tunnel file paths contain spaces?

Yes. Enclose the path in double quotation marks when it contains spaces:

tunnel upload "C:\my folder\data.txt" table_name;

Does Tunnel support .dbf database files?

No. Tunnel supports text files only. Binary files, including .dbf files, are not supported.

What are the valid speed ranges for Tunnel uploads and downloads?

Under normal network conditions, upload and download speeds range from 1 MB/s to 20 MB/s. Actual speed depends on network bandwidth and server performance.

How do I find the public Tunnel endpoints?

Public Tunnel endpoints vary by region and network type. For the full list, see Endpoints.

What do I do if Tunnel uploads or downloads fail?

  1. Get the Tunnel endpoint from odps_config.ini in the ..\odpscmd_public\conf directory.

  2. Test connectivity from a command line on the machine that runs the MaxCompute client:

    curl -i <tunnel_endpoint>
  3. If connectivity fails, check your network or switch to a valid endpoint.

Error: Java heap space FAILED

Java heap space FAILED: error occurred while running tunnel command

During upload: Check whether the delimiters are correct. If all records are being loaded as a single row due to an invalid delimiter, memory fills up quickly.

During upload or download: If the row size or data volume is genuinely large, increase the JVM heap size. Edit the odpscmd script in the bin directory of the client installation and increase the -Xms and -Xmx values:

java -Xms64m -Xmx512m -classpath "${clt_dir}/lib/*:${clt_dir}/conf/" com.aliyun.openservices.odps.console.ODPSConsole "$@"

Set -Xmx to a value appropriate for your data size.

Does a session have a lifecycle? What happens when it expires?

Yes. Each session has a 24-hour lifecycle starting from creation. After a session expires, it cannot be used. Create a new session to continue writing data.

Can multiple processes or threads share the same session?

Yes, but each block ID must be unique across all processes and threads using the session.
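
One simple way to guarantee unique block IDs without coordination at run time is to give each worker a disjoint, interleaved set of IDs up front. The sketch below illustrates that scheme. The function name is hypothetical, and the scheme assumes that the number of workers is fixed before the upload starts.

```python
def block_ids_for_worker(worker_index, worker_count, blocks_per_worker):
    """Return the block IDs reserved for one worker.

    Worker i takes IDs i, i + worker_count, i + 2 * worker_count, ...
    so no two workers can ever pick the same ID.
    """
    if not 0 <= worker_index < worker_count:
        raise ValueError("worker_index out of range")
    return [worker_index + k * worker_count for k in range(blocks_per_worker)]


# With 3 workers and 4 blocks each, worker 1 uses IDs 1, 4, 7, and 10.
print(block_ids_for_worker(1, 3, 4))
```

Keep worker_count × blocks_per_worker within the per-session block limit.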

What is the Tunnel routing feature?

If no Tunnel endpoint is configured, traffic is automatically routed to the Tunnel endpoint of the network where MaxCompute resides. If a Tunnel endpoint is configured, all traffic goes to that endpoint and automatic routing is disabled.

Does Tunnel support parallel uploads and downloads?

Yes. Use the --threads parameter to run parallel operations:

tunnel upload E:/1.txt tmp_table_0713 --threads 5;

How do I synchronize data of the GEOMETRY type to MaxCompute?

MaxCompute does not support the GEOMETRY data type. Convert GEOMETRY data to STRING before uploading.

GEOMETRY is not supported in the standard Java Database Connectivity (JDBC) framework, so special handling is required when importing or exporting GEOMETRY data.