PyODPS is the MaxCompute SDK for Python. It provides the DataFrame framework and basic operations on MaxCompute objects to help you analyze data in MaxCompute by using Python. You can use PyODPS in DataWorks or in an on-premises environment. This topic describes how to install PyODPS in an on-premises environment.
Before you install PyODPS, make sure that your Python version meets the requirements. We recommend that you use Python 3.6 or later. Python 2.7 or earlier is not recommended.
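The version requirement can also be checked from Python itself. The following is a minimal sketch; the helper name python_is_supported and the constant MIN_VERSION are hypothetical and are not part of PyODPS.

```python
import sys

# Minimum recommended version, taken from the requirement stated in this topic.
MIN_VERSION = (3, 6)


def python_is_supported(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if python_is_supported():
        print("The Python version meets the recommendation.")
    else:
        print("Python 3.6 or later is recommended.")
```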
Run the following command to install PyODPS:
pip install pyodps
Run the following command to check whether the installation is successful. If the command returns no output and reports no error, the installation is successful.
python -c "from odps import ODPS"
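If you want to perform the same check inside a script without importing the package, the standard library can look up the module. This is a sketch only; the helper name is_installed is hypothetical, and the check only tells you that a top-level module named odps can be found, not that it works.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be located."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # PyODPS is imported under the module name "odps".
    if is_installed("odps"):
        print("PyODPS is available.")
    else:
        print("PyODPS was not found. Run: pip install pyodps")
```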
If the Python interpreter that you want to use is not the default one on your system, call pip through that interpreter after pip is installed, as in the following command:
/home/tops/bin/python3.7 -m pip install "setuptools>=3.0"  # /home/tops/bin/python3.7 is the path of the Python interpreter. The quotation marks keep the shell from treating > as a redirection.
What to do next
We recommend that you install greenlet 0.4.10 or later to accelerate Tunnel-based data upload.
Initialize the MaxCompute entry point.
import os
from odps import ODPS

# Set the environment variable ALIBABA_CLOUD_ACCESS_KEY_ID to the AccessKey ID of your Alibaba Cloud account.
# Set the environment variable ALIBABA_CLOUD_ACCESS_KEY_SECRET to the AccessKey secret of your Alibaba Cloud account.
o = ODPS(
    os.getenv('ALIBABA_CLOUD_ACCESS_KEY_ID'),
    os.getenv('ALIBABA_CLOUD_ACCESS_KEY_SECRET'),
    project='your-default-project',
    endpoint='your-end-point',
)
ALIBABA_CLOUD_ACCESS_KEY_ID and ALIBABA_CLOUD_ACCESS_KEY_SECRET: Set these two environment variables to the AccessKey ID and the AccessKey secret of your Alibaba Cloud account, respectively.
Note: We recommend that you read the AccessKey ID and AccessKey secret from environment variables rather than writing them directly into your code.
your-default-project and your-end-point: Replace these placeholders with the name of your default project and the endpoint that you want to use. For more information about the endpoints of each region, see Endpoints.
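Because os.getenv returns None when a variable is not set, a missing variable may only surface later as an authentication error. The following sketch checks both variables before it creates the entry point. The helper names load_credentials and create_entry_point are hypothetical and are not part of PyODPS; only the ODPS constructor call mirrors the example in this topic.

```python
import os

# The two environment variables used in this topic.
REQUIRED_VARS = (
    "ALIBABA_CLOUD_ACCESS_KEY_ID",
    "ALIBABA_CLOUD_ACCESS_KEY_SECRET",
)


def load_credentials(env=None):
    """Return (access_key_id, access_key_secret) or raise if either is unset."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in REQUIRED_VARS)


def create_entry_point(project, endpoint, env=None):
    """Create the MaxCompute entry point after validating the credentials."""
    # Imported here so that the credential check works even without PyODPS.
    from odps import ODPS

    access_key_id, access_key_secret = load_credentials(env)
    return ODPS(access_key_id, access_key_secret, project=project, endpoint=endpoint)
```

For example, o = create_entry_point('your-default-project', 'your-end-point') raises a RuntimeError that names the unset variable instead of passing None to ODPS.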
After you complete the preceding configurations, you can use PyODPS in your on-premises environment. For example, you can perform basic operations on MaxCompute objects, such as deleting them. For more information about how to use PyODPS, see Overview of basic operations and Overview of DataFrame.
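As an illustration of a delete operation, the following sketch removes a table through the entry object by using the exist_table and delete_table methods of the ODPS object. The helper name drop_table_if_exists and the table name in the usage line are hypothetical. Deleting a table is irreversible, so try this only on a table that you no longer need.

```python
def drop_table_if_exists(o, table_name):
    """Delete a table through the entry object and report whether it existed.

    `o` is assumed to be an initialized ODPS entry object.
    """
    existed = o.exist_table(table_name)
    # if_exists=True makes the call a no-op when the table is absent.
    o.delete_table(table_name, if_exists=True)
    return existed
```

A call such as drop_table_if_exists(o, 'my_test_table') returns True if the table was present before the deletion.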
Unless otherwise specified, the o object in this topic is the MaxCompute entry object that is initialized in the preceding code.