This topic introduces Python on MaxCompute (PyODPS) and describes the general usage of PyODPS.
Background information
PyODPS is the MaxCompute SDK for Python. It supports the DataFrame framework and basic operations on MaxCompute objects, so you can use it to analyze data in MaxCompute.
PyODPS supports Python 2 (version 2.6 and later) and Python 3.
For more information about PyODPS, see the following resources:
For more information about PyODPS, see PyODPS: ODPS Python SDK and data analysis framework and PyODPS-related articles.
For more information about how to download PyODPS, visit GitHub.
For more information about how to install PyODPS, see PyODPS installation instructions.
For more information about how to develop PyODPS, see PyODPS developer guide.
If you want to help build the PyODPS ecosystem, you can perform the following operations:
Write PyODPS documentation.
Contribute to the PyODPS code on GitHub.
Join the DingTalk group for technical discussion.
Initialization
Before you can use PyODPS, you must initialize a connection to MaxCompute by using your Alibaba Cloud account. To initialize a connection, run the following code:
import os

from odps import ODPS

# Set the environment variable ALIBABA_CLOUD_ACCESS_KEY_ID to the AccessKey ID of your Alibaba Cloud account.
# Set the environment variable ALIBABA_CLOUD_ACCESS_KEY_SECRET to the AccessKey secret of your Alibaba Cloud account.
o = ODPS(
    os.getenv('ALIBABA_CLOUD_ACCESS_KEY_ID'),
    os.getenv('ALIBABA_CLOUD_ACCESS_KEY_SECRET'),
    project='your-default-project',
    endpoint='your-end-point',
)
Parameters:
ALIBABA_CLOUD_ACCESS_KEY_ID: the AccessKey ID of your Alibaba Cloud account. Make sure that the account has the permissions to manage objects in a MaxCompute project. For more information about the permissions, see Operation permissions. In the upper-right corner of the MaxCompute console, click the profile picture and select AccessKey Management to obtain the AccessKey ID.
ALIBABA_CLOUD_ACCESS_KEY_SECRET: the AccessKey secret that corresponds to the AccessKey ID. In the upper-right corner of the MaxCompute console, click the profile picture and select AccessKey Management to obtain the AccessKey secret.
your-default-project: the name of your MaxCompute project. To view the project name, log on to the MaxCompute console, select a region in the top navigation bar, and choose Workspace > Projects in the left-side navigation pane.
your-end-point: the endpoint of the region where your MaxCompute project resides. For more information, see Endpoints.
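If either environment variable is unset, os.getenv returns None and the problem may only surface later as an authentication error. The following minimal sketch checks the variables before it creates the entry object. The helper names build_odps_kwargs and connect are hypothetical and not part of PyODPS, and the keyword names access_id and secret_access_key are assumed to match the first two positional parameters of ODPS().

```python
import os

REQUIRED_VARS = (
    'ALIBABA_CLOUD_ACCESS_KEY_ID',
    'ALIBABA_CLOUD_ACCESS_KEY_SECRET',
)


def build_odps_kwargs(project, endpoint, env=None):
    """Collect the arguments for ODPS() and fail early if a credential is missing.

    Hypothetical helper for illustration; it is not part of PyODPS.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError('Missing environment variables: ' + ', '.join(missing))
    return {
        # Assumed keyword names for the first two ODPS() parameters.
        'access_id': env['ALIBABA_CLOUD_ACCESS_KEY_ID'],
        'secret_access_key': env['ALIBABA_CLOUD_ACCESS_KEY_SECRET'],
        'project': project,
        'endpoint': endpoint,
    }


def connect(project, endpoint):
    """Create the MaxCompute entry object after the credentials are validated."""
    # Imported here so that the validation helper can be used without PyODPS installed.
    from odps import ODPS
    return ODPS(**build_odps_kwargs(project, endpoint))
```

With both variables set, calling connect('your-default-project', 'your-end-point') should return the same kind of object as the o created in the preceding code.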
Description
The following table describes the methods that you can use to perform basic operations on MaxCompute objects.
Item | Operation | Description |
Projects | get_project(project_name) | Obtains a MaxCompute project. |
Projects | exist_project(project_name) | Checks whether a MaxCompute project exists. |
Tables | list_tables() | Lists all tables in a MaxCompute project. |
Tables | exist_table(table_name) | Checks whether a table exists. |
Tables | get_table(table_name, project=project_name) | Obtains a specified table. You can obtain a table from another MaxCompute project. |
Tables | create_table() | Creates a table. |
Tables | read_table() | Reads data from a table. |
Tables | write_table() | Writes data to a table. |
Tables | delete_table() | Deletes an existing table. |
Table partitions | exist_partition() | Checks whether a partition exists. |
Table partitions | get_partition() | Obtains information about a partition. |
Table partitions | create_partition() | Creates a partition. |
Table partitions | delete_partition() | Deletes an existing partition. |
SQL | execute_sql()/run_sql() | Executes SQL statements. |
SQL | open_reader() | Reads execution results of SQL statements. |
Instances | list_instances() | Lists all instances in a MaxCompute project. |
Instances | exist_instance() | Checks whether an instance exists. |
Instances | get_instance() | Obtains information about an instance. |
Instances | stop_instance() | Terminates an instance. |
Resources | create_resource() | Creates a resource. |
Resources | open_resource() | Opens a resource. |
Resources | get_resource() | Obtains information about a resource. |
Resources | list_resources() | Lists all existing resources. |
Resources | exist_resource() | Checks whether a resource exists. |
Resources | delete_resource() | Deletes an existing resource. |
Functions | create_function() | Creates a function. |
Functions | delete_function() | Deletes an existing function. |
Upload and download tunnels | create_upload_session() | Creates a session that is used to upload data. |
Upload and download tunnels | create_download_session() | Creates a session that is used to download data. |
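As a rough illustration of how a few of these methods combine, the following sketch wraps exist_table(), create_table(), get_table(), and execute_sql() in two small helpers. The helper names ensure_table and count_rows are hypothetical, and o is assumed to be the entry object created in the Initialization section. The schema is passed as a string of column definitions such as 'num bigint, num2 double'.

```python
def ensure_table(o, table_name, schema):
    """Create the table if it does not exist yet, then return the table object.

    Hypothetical helper for illustration; o is an initialized ODPS entry object.
    """
    if not o.exist_table(table_name):
        o.create_table(table_name, schema, if_not_exists=True)
    return o.get_table(table_name)


def count_rows(o, table_name):
    """Run a COUNT(*) query and return the value in the first result record."""
    # execute_sql() blocks until the SQL instance finishes.
    instance = o.execute_sql('SELECT COUNT(*) FROM {}'.format(table_name))
    with instance.open_reader() as reader:
        for record in reader:
            return record[0]
    # Returned only if the reader yields no records.
    return None
```

For example, ensure_table(o, 'my_new_table', 'num bigint, num2 double') followed by count_rows(o, 'my_new_table') creates the table on the first call and then counts its rows. The table name is interpolated directly into the SQL text, so this sketch is suitable only for trusted table names.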