TableTunnel is the entry class of the MaxCompute Tunnel service for uploading and downloading table data. Views are not supported.
The lifecycle of a TableTunnel instance spans from its creation to the completion of the data upload or download.
API definition
For the full Javadoc, see Java SDK reference.
```java
public class TableTunnel {
    public UploadSession createUploadSession(String projectName, String tableName, boolean overwrite);
    public UploadSession createUploadSession(String projectName, String tableName, PartitionSpec partitionSpec, boolean overwrite);
    public UploadSession getUploadSession(String projectName, String tableName, String id);
    public UploadSession getUploadSession(String projectName, String tableName, PartitionSpec partitionSpec, String id);
    public DownloadSession createDownloadSession(String projectName, String tableName);
    public DownloadSession createDownloadSession(String projectName, String tableName, PartitionSpec partitionSpec);
    public DownloadSession getDownloadSession(String projectName, String tableName, String id);
    public DownloadSession getDownloadSession(String projectName, String tableName, PartitionSpec partitionSpec, String id);
}
```
Key concepts
Session: A session represents one upload or download operation on a table or partition. Internally, it wraps one or more HTTP requests to the Tunnel RESTful APIs. Each session has a unique session ID and remains valid for 24 hours.
UploadSession: Manages data upload. Created by TableTunnel.createUploadSession() and closed by session.commit().
DownloadSession: Manages data download. Created by TableTunnel.createDownloadSession().
RecordWriter: Handles writes within an upload session. Each RecordWriter maps to one HTTP request and is identified by a block ID, which also serves as the name of the corresponding file.
Block ID: A numeric identifier for each data block in an upload session. Valid range: 0 to 19,999 (inclusive).
Upload process
An upload session moves data through three stages, each triggered by a specific method call:
1. RecordWriter.write() — writes data as files to a temporary directory.
2. RecordWriter.close() — moves files from the temporary directory to a data directory.
3. session.commit() — moves all files into the final table directory and updates table metadata, making the data visible to other MaxCompute jobs (SQL, MapReduce).
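The three stages map onto SDK calls roughly as follows. This is a sketch, not a complete program: it assumes an already-configured TableTunnel instance named `tunnel`, and the project, table, and column names (`my_project`, `my_table`, `id`) are placeholders.

```java
// Sketch of the three-stage upload flow; `tunnel` and all names are
// placeholder assumptions, not part of the official example.
TableTunnel.UploadSession session =
        tunnel.createUploadSession("my_project", "my_table", false);

long blockId = 0; // must be in the range 0-19,999
RecordWriter writer = session.openRecordWriter(blockId);

Record record = session.newRecord();
record.setBigint("id", 42L);
writer.write(record);   // stage 1: data written to a temporary directory
writer.close();         // stage 2: block moved to the data directory

// stage 3: commit moves files to the table directory and updates metadata,
// making the data visible to other MaxCompute jobs.
session.commit(new Long[]{blockId});
```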
INSERT INTO vs. INSERT OVERWRITE
The overwrite parameter in createUploadSession() controls how data is written:
| overwrite value | Equivalent SQL | Behavior |
|---|---|---|
| Not specified | INSERT INTO | Sessions on the same table or partition are independent; each session's data is saved in a separate directory. |
| false | INSERT INTO | Same as not specified. |
| true | INSERT OVERWRITE | All existing data in the table or partition is replaced by the current session's data. Do not run concurrent upload sessions on the same table or partition. |
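In code, the choice is just the final boolean argument to createUploadSession(). A minimal sketch, assuming an already-configured Odps client named `odps` (project and table names are placeholders):

```java
TableTunnel tunnel = new TableTunnel(odps); // `odps` is an assumed, pre-configured client

// INSERT INTO semantics: data is appended alongside other sessions' data.
TableTunnel.UploadSession appendSession =
        tunnel.createUploadSession("my_project", "my_table", false);

// INSERT OVERWRITE semantics: on commit, replaces all existing data.
// Do not run concurrent upload sessions on the same table or partition.
TableTunnel.UploadSession overwriteSession =
        tunnel.createUploadSession("my_project", "my_table", true);
```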
Limits
| Constraint | Limit | Guidance |
|---|---|---|
| Block ID range | 0–19,999 | Block IDs outside this range are rejected. |
| Data per block | 100 GB maximum | Split large uploads across multiple blocks. |
| Session lifecycle | 24 hours | For datasets that take longer to transfer, split the upload across multiple sessions. |
| HTTP request (RecordWriter) | 120-second idle timeout | If no data flows over the HTTP connection for 120 seconds, the server closes it. HTTP buffers 8 KB of data before sending — call TunnelRecordWriter.flush() to force-flush the buffer if needed. |
| RecordReader lifecycle | 300 seconds | — |
| RecordWriter batch size | ~64 MB per batch | Write multiple records with a single RecordWriter, batching roughly 64 MB of data per flush. Do not create one RecordWriter per record; this generates large numbers of small files and degrades performance. |
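The block-count and per-block size limits combine into simple planning arithmetic. The helper below is hypothetical (it is not part of the SDK); it only illustrates how many blocks a dataset needs under the 100 GB-per-block and 20,000-block limits.

```java
// Hypothetical planning helper, NOT an SDK class: computes how many blocks
// a dataset of a given size needs within one upload session.
public class BlockPlanner {
    static final long MAX_BLOCK_BYTES = 100L * 1024 * 1024 * 1024; // 100 GB per block
    static final int MAX_BLOCKS = 20_000;                          // block IDs 0-19,999

    static int blocksNeeded(long totalBytes) {
        // Ceiling division: each block holds at most MAX_BLOCK_BYTES.
        long blocks = (totalBytes + MAX_BLOCK_BYTES - 1) / MAX_BLOCK_BYTES;
        if (blocks > MAX_BLOCKS) {
            throw new IllegalArgumentException(
                "Dataset exceeds one session's capacity; split across sessions");
        }
        return (int) blocks;
    }

    public static void main(String[] args) {
        // A 250 GB dataset needs 3 blocks (100 + 100 + 50 GB).
        System.out.println(blocksNeeded(250L * 1024 * 1024 * 1024));
    }
}
```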
Retry and fault tolerance
If a block upload fails, re-open a RecordWriter with the same block ID and retry. The last successful close() call for a given block ID overwrites all previous data for that block — use this behavior to retransmit failed blocks without duplicating data.
If a session times out before all data is uploaded, start a new session for the remaining data.
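Put together, a per-block retry loop might look like the following sketch. It assumes an open upload session named `session` and a `recordsForBlock` list supplied by the caller; the names and the retry policy are illustrative assumptions, not prescribed by the SDK.

```java
// Hedged sketch: retry a failed block by re-opening a writer with the SAME
// block ID. The last successful close() replaces earlier data for the block,
// so retries do not duplicate data.
long blockId = 7;        // reuse the same block ID on every retry
int maxRetries = 3;      // illustrative policy, not an SDK constant
for (int attempt = 1; attempt <= maxRetries; attempt++) {
    try {
        RecordWriter writer = session.openRecordWriter(blockId);
        for (Record r : recordsForBlock) {
            writer.write(r);
        }
        writer.close(); // success: this attempt's data is the block's data
        break;
    } catch (Exception e) {
        if (attempt == maxRetries) {
            // Out of retries; if the 24-hour session window has expired,
            // start a new session for the remaining data instead.
            throw e;
        }
    }
}
```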
Billing and permissions
Downloads through public endpoints are billed. For endpoint details, see Endpoints.
If the download control feature is enabled, users on public endpoints must have the corresponding download permissions before downloading data. For details, see Download control.