MaxCompute:Examples of using Tunnel SDK for Python

Last Updated:Dec 20, 2023

MaxCompute Tunnel is a data transfer service that you can use to upload data to or download data from MaxCompute. Tunnel SDK for Python is included in PyODPS. This topic provides examples that demonstrate how to upload data to and download data from MaxCompute by using Tunnel SDK for Python.

Usage notes

  • The following sections provide examples of how to upload data to and download data from MaxCompute by using the SDK for Python. For examples in other scenarios, see the PyODPS documentation.

  • In a Cython environment, PyODPS compiles C extensions during installation to accelerate Tunnel-based data uploads and downloads.

Example of data upload

import os
from odps import ODPS
from odps.tunnel import TableTunnel

# Set the environment variable ALIBABA_CLOUD_ACCESS_KEY_ID to the AccessKey ID of the Alibaba Cloud account. 
# Set the environment variable ALIBABA_CLOUD_ACCESS_KEY_SECRET to the AccessKey secret of the Alibaba Cloud account. 
# We recommend that you do not directly use your AccessKey ID or AccessKey secret.
o = ODPS(
    os.getenv('ALIBABA_CLOUD_ACCESS_KEY_ID'),
    os.getenv('ALIBABA_CLOUD_ACCESS_KEY_SECRET'),
    project='your-default-project',
    endpoint='your-end-point',
)

table = o.get_table('my_table')

tunnel = TableTunnel(o)
upload_session = tunnel.create_upload_session(table.name, partition_spec='pt=test')

with upload_session.open_record_writer(0) as writer:
    record = table.new_record()
    record[0] = 'test1'
    record[1] = 'id1'
    writer.write(record)

    record = table.new_record(['test2', 'id2'])
    writer.write(record)

# Call commit() outside the with block. If you call it before all data is written, an error is reported.
upload_session.commit([0])
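A Tunnel upload session can write several blocks, each identified by the block ID passed to open_record_writer(), and commit() accepts the list of all block IDs, as the example above shows for a single block 0. The sketch below is a hypothetical helper, not part of PyODPS, that partitions a row list into fixed-size blocks so each chunk can be written under its own block ID.

```python
def plan_blocks(rows, rows_per_block):
    """Partition rows into (block_id, chunk) pairs for a Tunnel upload.

    Hypothetical helper for illustration only. Block IDs start at 0 and
    each chunk holds at most rows_per_block rows.
    """
    if rows_per_block <= 0:
        raise ValueError("rows_per_block must be positive")
    return [
        (block_id, rows[start:start + rows_per_block])
        for block_id, start in enumerate(range(0, len(rows), rows_per_block))
    ]

# Usage sketch, assuming table and upload_session from the example above:
# blocks = plan_blocks(all_rows, 10000)
# for block_id, chunk in blocks:
#     with upload_session.open_record_writer(block_id) as writer:
#         for values in chunk:
#             writer.write(table.new_record(values))
# upload_session.commit([block_id for block_id, _ in blocks])
```

Writing in multiple blocks lets failed blocks be retried individually before the final commit.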

Example of data download

import os
from odps import ODPS
from odps.tunnel import TableTunnel

# Create the ODPS object in the same way as in the upload example.
o = ODPS(
    os.getenv('ALIBABA_CLOUD_ACCESS_KEY_ID'),
    os.getenv('ALIBABA_CLOUD_ACCESS_KEY_SECRET'),
    project='your-default-project',
    endpoint='your-end-point',
)

tunnel = TableTunnel(o)
download_session = tunnel.create_download_session('my_table', partition_spec='pt=test')

with download_session.open_record_reader(0, download_session.count) as reader:
    for record in reader:
        # Process each record.
        print(record.values)

with download_session.open_arrow_reader(0, download_session.count) as reader:
    for batch in reader:
        # Process each Arrow RecordBatch.
        print(batch.num_rows)