MaxCompute lets you run MaxFrame from an on-premises MaxFrame client that wraps the MaxFrame SDK. This topic describes how to install, configure, and use MaxFrame in an on-premises environment.
Prerequisites
Python 3.7 or 3.11 is installed. Other versions may cause errors.
pip is installed. If pip is not pre-installed with your Python version, see the official Python website for installation instructions.
A MaxCompute project is created. For more information, see Create a MaxCompute project.
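The interpreter and pip prerequisites can be checked from Python itself before installing anything. The following is a minimal sketch, not part of MaxFrame: the names SUPPORTED_VERSIONS, is_supported, and pip_available are hypothetical helpers introduced here, and the version set simply mirrors the two versions listed above.

```python
import importlib.util
import sys

# Assumption: mirrors the versions listed in the prerequisites (3.7 and 3.11).
SUPPORTED_VERSIONS = {(3, 7), (3, 11)}


def is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) pair is one of the listed versions."""
    return (version_info[0], version_info[1]) in SUPPORTED_VERSIONS


def pip_available():
    """Return True if this interpreter can locate the pip module."""
    return importlib.util.find_spec("pip") is not None


if __name__ == "__main__":
    print("Python", sys.version.split()[0], "supported:", is_supported())
    print("pip available:", pip_available())
```

Run the file with the same python command that you plan to use for the installation, so that the check applies to the correct interpreter.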
Tip: We recommend using a Python virtual environment to avoid dependency conflicts with other projects. You can create one with python -m venv maxframe-env and activate it before running the install command.
Install MaxFrame
Open your system CLI (for example, Command Prompt on Windows) and run the following command:
pip install --upgrade maxframe
To verify the installation, run the following command. If no output or error message is returned, the installation is successful:
python -c "import maxframe.dataframe as md"
If an error is reported, check your default Python version and switch to Python 3.7 or 3.11:
# Check the default Python version.
python --version
# Switch to Python 3.7. Replace $path/python3.7 with the actual installation path of Python 3.7.
# The quotation marks prevent the shell from treating > as output redirection.
$path/python3.7 -m pip install "setuptools>=3.0"
Configure credentials
MaxFrame connects to MaxCompute using an AccessKey pair. Store your credentials in environment variables so they are not hard-coded in your scripts.
Set the following environment variables to the AccessKey ID and AccessKey secret of the Alibaba Cloud account that has the required permissions on the objects you want to manage in your MaxCompute project:
ALIBABA_CLOUD_ACCESS_KEY_ID -- your AccessKey ID
ALIBABA_CLOUD_ACCESS_KEY_SECRET -- your AccessKey secret
You can obtain your AccessKey pair from the AccessKey Pair page in the Alibaba Cloud Management Console.
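If either variable is unset, os.getenv returns None and the connection fails later with a less obvious error. The following is a minimal sketch of an early check. The function name load_credentials and the REQUIRED_VARS tuple are hypothetical and not part of MaxFrame or PyODPS; only the two variable names come from this topic.

```python
import os

# The two environment variables described in this section.
REQUIRED_VARS = ("ALIBABA_CLOUD_ACCESS_KEY_ID", "ALIBABA_CLOUD_ACCESS_KEY_SECRET")


def load_credentials(env=None):
    """Return (access_key_id, access_key_secret), or raise if either is unset or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in REQUIRED_VARS)
```

A script could call load_credentials() once at startup and pass the two returned values to the ODPS constructor in place of the os.getenv calls.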
Run a sample script
Create a .py file in your on-premises environment, such as test.py, and add the sample code below. Replace the following placeholders with your actual values:
your-default-project -- The name of your MaxCompute project. To find the project name, log on to the MaxCompute console and choose Workspace > Projects in the left-side navigation pane.
your-end-point -- The endpoint of the region in which your MaxCompute project resides. For more information, see Endpoints.
import os
import maxframe.dataframe as md
from odps import ODPS
from maxframe import new_session

# Create a MaxCompute entry.
o = ODPS(
    # Read credentials from environment variables.
    # Do not hard-code your AccessKey ID and AccessKey secret in source files.
    os.getenv('ALIBABA_CLOUD_ACCESS_KEY_ID'),
    os.getenv('ALIBABA_CLOUD_ACCESS_KEY_SECRET'),
    project='your-default-project',
    endpoint='your-end-point',
)

table = o.create_table("test_source_table", "a string, b bigint", if_not_exists=True)
with table.open_writer() as writer:
    writer.write([
        ["value1", 0],
        ["value2", 1],
    ])

# Create a MaxFrame session.
session = new_session(o)

df = md.read_odps_table("test_source_table", index_col="b")
df["a"] = "prefix_" + df["a"]

# Print DataFrame data.
print(df.execute().fetch())

# Write data from a MaxFrame DataFrame to a MaxCompute table.
md.to_odps_table(df, "test_prefix_source_table").execute()

# Destroy the MaxFrame session.
session.destroy()
Navigate to the directory that contains test.py and run the file:
python test.py
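In the sample script, session.destroy() is skipped if an earlier statement raises an exception. One way to guard against that is a small context manager, sketched below. The name managed_session is hypothetical and is not provided by MaxFrame; the sketch assumes only that the session object has a destroy() method, as in the sample script.

```python
from contextlib import contextmanager


@contextmanager
def managed_session(session_factory, *args, **kwargs):
    """Create a session and call destroy() on it when the block exits, even on error."""
    session = session_factory(*args, **kwargs)
    try:
        yield session
    finally:
        session.destroy()


# Usage with the sample script (assumes `o` and `new_session` from test.py):
#
#     with managed_session(new_session, o):
#         df = md.read_odps_table("test_source_table", index_col="b")
#         print(df.execute().fetch())
```

Passing the factory as an argument keeps the helper independent of MaxFrame imports, so it can be exercised without a MaxCompute connection.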
Verify the result
After you run the script, verify that MaxFrame is working correctly by checking the following outputs.
Script output: The script prints the following DataFrame result:
b a
0 prefix_value1
1 prefix_value2
Table query: Execute the following SQL statement in your MaxCompute project to query data in the test_prefix_source_table table:
SELECT * FROM test_prefix_source_table;
The following result is returned:
+------------+---------------+
| b          | a             |
+------------+---------------+
| 0          | prefix_value1 |
| 1          | prefix_value2 |
+------------+---------------+
If both outputs match the expected results above, MaxFrame is working correctly in your on-premises environment.
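The table query can also be run from Python through the same ODPS entry object that the sample script creates. The following is a sketch under stated assumptions: fetch_rows and EXPECTED_ROWS are hypothetical names, the expected rows assume the script was run once against tables that did not exist before, and the helper relies on the PyODPS execute_sql and open_reader calls.

```python
# Assumption: the rows expected after a single run of the sample script.
EXPECTED_ROWS = [(0, "prefix_value1"), (1, "prefix_value2")]


def fetch_rows(o, table_name="test_prefix_source_table"):
    """Return the (b, a) pairs stored in the table, sorted for a stable comparison."""
    instance = o.execute_sql("SELECT * FROM {}".format(table_name))
    with instance.open_reader() as reader:
        return sorted((record["b"], record["a"]) for record in reader)


# Usage (assumes `o` is the ODPS entry object from test.py):
#
#     assert fetch_rows(o) == EXPECTED_ROWS
```

The rows are sorted because a SELECT statement without ORDER BY does not guarantee the order of the returned records.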