When you develop Python code with MaxFrame, you often need to reference third-party packages or images. MaxFrame provides an interface that lets you directly use uploaded packages and images for quick access.
Reference a third-party package in MaxFrame development
Upload a third-party package.
Note: For more information about creating a third-party package, see Create a third-party package for PyODPS.
Before you use a third-party package, ensure that the package is uploaded to MaxCompute as an Archive resource. This topic uses the packages.tar.gz third-party package as an example. You can use one of the following methods to upload the package:
Upload the package using code. The following code provides an example.
    import os
    from odps import ODPS

    # Make sure the ALIBABA_CLOUD_ACCESS_KEY_ID environment variable is set to your AccessKey ID.
    # Make sure the ALIBABA_CLOUD_ACCESS_KEY_SECRET environment variable is set to your AccessKey secret.
    # Do not use the AccessKey ID or AccessKey secret strings directly.
    o = ODPS(
        os.getenv('ALIBABA_CLOUD_ACCESS_KEY_ID'),
        os.getenv('ALIBABA_CLOUD_ACCESS_KEY_SECRET'),
        project='<YOUR-DEFAULT-PROJECT>',
        endpoint='<YOUR-END-POINT>',
    )

    # Replace packages.tar.gz with the path and file name of the target package.
    # The with block closes the local file once the upload finishes.
    with open("packages.tar.gz", "rb") as package_file:
        o.create_resource("packages.tar.gz", "archive", fileobj=package_file)

Upload the resource using DataWorks. For more information, see Create and use MaxCompute resources.
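Before using either upload method above, it can be worth confirming that the local file really is a readable gzip-compressed tar archive and seeing what it contains at the top level. The following is a minimal sketch that uses only the Python standard library; the helper name top_level_entries is hypothetical and is not part of PyODPS or MaxFrame.

```python
import tarfile


def top_level_entries(archive_path):
    """Return the sorted top-level names inside a .tar.gz archive."""
    # Mode "r:gz" makes tarfile raise tarfile.ReadError if the file
    # is not a gzip-compressed tar archive.
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()

    # Keep only the first path component of each member,
    # ignoring a leading "./" and empty components.
    tops = set()
    for name in names:
        parts = [p for p in name.split("/") if p not in ("", ".")]
        if parts:
            tops.add(parts[0])
    return sorted(tops)
```

Calling top_level_entries("packages.tar.gz") on the package you are about to upload returns a list of its top-level directories and files, which you can compare against what you expected to package.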
Reference the third-party package in MaxFrame development. For more information, see Automatic packaging service.
MaxFrame lets you declare the third-party packages and files that a function needs during job development. To do this, import the with_resource_libraries decorator from maxframe.udf and apply it to the function, as shown in the following example.

    from maxframe.udf import with_resource_libraries

    @with_resource_libraries("packages.tar.gz", "demo.py")
    def my_func(v):  # placeholder function for illustration
        ...
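To make the idea of a declaration concrete, the following toy sketch shows how a decorator can record resource names on a function without changing what the function does. This is not how MaxFrame implements with_resource_libraries; the names declare_resources, declared_resources, and echo are hypothetical and exist only for this illustration.

```python
def declare_resources(*resource_names):
    """Toy stand-in: record the resource names a function says it needs."""
    def decorator(func):
        # Attach the declared names to the function object itself, so that
        # other code could read them later without calling the function.
        func.declared_resources = tuple(resource_names)
        return func
    return decorator


@declare_resources("packages.tar.gz", "demo.py")
def echo(v):
    return v


print(echo.declared_resources)  # ('packages.tar.gz', 'demo.py')
```

The decorated function still behaves as before; only the metadata is added. In the same spirit, the declaration in MaxFrame states which uploaded resources the function depends on.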
Example
The following example shows how to reference a third-party package in a MaxFrame job. The example uses scipy.special.psi from the packages.tar.gz package to compute the psi (digamma) function for each value in the col1 column of the test_float_col test table.
Prepare the test table test_float_col and test data.

    CREATE TABLE test_float_col (col1 double);
    INSERT INTO test_float_col VALUES (3.75),(2.51);

Write MaxFrame code and save it as the demo.py file on your local machine. The following code provides an example:

    # Compute the psi (digamma) value of each value in the col1 column
    # of the test_float_col test table.
    import os
    from odps import ODPS
    from maxframe.session import new_session
    import maxframe.dataframe as md
    from maxframe.udf import with_resource_libraries

    # Reference the third-party package.
    @with_resource_libraries("packages.tar.gz")
    def my_psi(v):
        # Import inside the function so that scipy is loaded from the referenced package.
        from scipy.special import psi
        return float(psi(v))

    o = ODPS(
        # Make sure the ALIBABA_CLOUD_ACCESS_KEY_ID environment variable is set to your AccessKey ID.
        # Make sure the ALIBABA_CLOUD_ACCESS_KEY_SECRET environment variable is set to your AccessKey secret.
        # Do not use the AccessKey ID or AccessKey secret strings directly.
        os.getenv('ALIBABA_CLOUD_ACCESS_KEY_ID'),
        os.getenv('ALIBABA_CLOUD_ACCESS_KEY_SECRET'),
        project='your-default-project',
        endpoint='your-end-point'
    )

    # Create a MaxFrame session.
    session = new_session(o)
    df = md.read_odps_table('test_float_col')

    # Execute the code and obtain the result.
    print(df.col1.map(my_psi).execute().fetch())

Run the demo.py file on the local MaxFrame client. The following command provides an example:

    python demo.py

The following result is returned.

    0    1.182537
    1    0.708048
    Name: col1, dtype: float64
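If you want to sanity-check the returned values without installing SciPy locally, the psi (digamma) function can be approximated in pure Python. The sketch below is not SciPy's implementation; it combines the recurrence psi(x) = psi(x + 1) - 1/x with the first terms of the standard asymptotic series, and the function name digamma_approx is hypothetical.

```python
import math


def digamma_approx(x):
    """Approximate psi(x) for x > 0 with recurrence plus an asymptotic series."""
    if x <= 0:
        raise ValueError("this sketch only handles x > 0")

    result = 0.0
    # Recurrence psi(x) = psi(x + 1) - 1/x: shift x upward until the
    # asymptotic series below is accurate enough.
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0

    # Asymptotic expansion:
    # ln(x) - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    inv2 = 1.0 / (x * x)
    result += math.log(x) - 0.5 / x
    result -= inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result


for value in (3.75, 2.51):
    print(value, digamma_approx(value))
```

For the two test values 3.75 and 2.51, the approximation should agree with the result above to within about 1e-5.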
Reference an image in MaxFrame development
The following example shows how to use an image in a MaxFrame job. The example references the built-in scipy image and uses scipy.special.psi to compute the psi (digamma) function for each value in the col1 column of the test_float_col test table.
Prepare the test table test_float_col and test data.

    CREATE TABLE test_float_col (col1 double);
    INSERT INTO test_float_col VALUES (3.75),(2.51);

Write MaxFrame code and save it as the demo.py file on your local machine. The following code provides an example.

    # Compute the psi (digamma) value of each value in the col1 column
    # of the test_float_col test table.
    import os
    from odps import ODPS
    from maxframe.session import new_session
    import maxframe.dataframe as md
    from maxframe import config

    # Reference the built-in scipy image.
    config.options.sql.settings = {
        "odps.session.image": "scipy"
    }

    def my_psi(v):
        # Import inside the function so that scipy is loaded from the image at run time.
        from scipy.special import psi
        return float(psi(v))

    o = ODPS(
        # Make sure the ALIBABA_CLOUD_ACCESS_KEY_ID environment variable is set to your AccessKey ID.
        # Make sure the ALIBABA_CLOUD_ACCESS_KEY_SECRET environment variable is set to your AccessKey secret.
        # Do not use the AccessKey ID or AccessKey secret strings directly.
        os.getenv('ALIBABA_CLOUD_ACCESS_KEY_ID'),
        os.getenv('ALIBABA_CLOUD_ACCESS_KEY_SECRET'),
        project='your-default-project',
        endpoint='your-end-point'
    )

    # Create a MaxFrame session.
    session = new_session(o)
    df = md.read_odps_table('test_float_col')

    # Execute the code and obtain the result.
    print(df.col1.map(my_psi).execute().fetch())

Run the demo.py file on the local MaxFrame client. The following command provides an example:

    python demo.py

The following result is returned.

    0    1.182537
    1    0.708048
    Name: col1, dtype: float64
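The image example assigns a new dictionary to config.options.sql.settings, which in plain Python terms replaces whatever dictionary was assigned there before. If your job sets other SQL settings elsewhere, you may prefer to merge the image key into the existing dictionary. The following is a minimal sketch of such a merge; the helper name with_session_image and the key some.other.setting are hypothetical.

```python
def with_session_image(settings, image_name):
    """Return a copy of `settings` with "odps.session.image" set to `image_name`."""
    # Tolerate None and copy the input so the caller's dictionary is not mutated.
    merged = dict(settings or {})
    merged["odps.session.image"] = image_name
    return merged


existing = {"some.other.setting": "value"}  # hypothetical pre-existing setting
print(with_session_image(existing, "scipy"))
# {'some.other.setting': 'value', 'odps.session.image': 'scipy'}
```

In a job, this could be used as config.options.sql.settings = with_session_image(config.options.sql.settings, "scipy"), assuming the current value reads back as a dictionary or None.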