This topic describes the features and usage of MaxFrame local debug mode. This mode lets you debug user-defined functions (UDFs) such as apply() and apply_chunk() directly in your local environment without connecting to a remote service.
Background
Traditionally, debugging MaxFrame UDFs such as apply() and apply_chunk() required submitting code to a remote cluster environment. Because the code ran remotely, you could not set breakpoints or step through it locally. Each modification therefore required resubmission, and developers often had to maintain separate codebases for local and production environments.
MaxFrame local debug mode solves these challenges by running UDF functions directly in your local Python environment. This approach allows for IDE breakpoint debugging, works entirely offline, and enables you to use a single codebase for both local debugging and production runs.
Use cases
Scenario | Description |
UDF logic development | Debug and verify complex business logic in real time. |
Data transformation testing | Validate data cleaning and transformation rules. |
Troubleshooting | Identify the root cause of UDF execution errors. |
Offline development | Continue development work without a network connection. |
Features
Compared to traditional remote debugging, local debug mode offers the following advantages:
Dimension | Local debug mode | Traditional approach |
Breakpoint debugging | Supports IDE breakpoint debugging | Not supported |
Remote dependency | Enables fully offline debugging | Requires connection to a remote cluster environment |
Debug cycle | Immediate local execution | Requires remote submission for each run |
Codebase | Single codebase | Requires maintaining multiple codebases |
Zero-configuration debugging
Simply set debug=True or debug="local". No additional tools or services are required.

    session = new_session(o, debug=True)

Full offline capability
Works without network connectivity or remote cluster resources.
Native IDE support
Supports popular IDEs like PyCharm and VSCode, as well as DataWorks Notebook.
Retains full debugging capabilities, including setting breakpoints, watching variables, and single-step execution.
The debugging experience is identical to native Python development.
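When you only want to pause on unusual rows, a conditional call to Python's built-in breakpoint() inside the UDF can complement breakpoints set in the IDE. The following is a minimal sketch, assuming the UDF runs in the local interpreter as described above. The function name flag_suspicious_sales, the limit parameter, and the UDF_BREAK_ON_SUSPICIOUS environment variable are hypothetical names chosen for illustration, and plain dicts stand in for DataFrame rows.

```python
import os


def flag_suspicious_sales(row, limit=1_000_000):
    """Hypothetical UDF: return True when a sales value looks implausible."""
    sales = row["sales"]
    suspicious = sales < 0 or sales > limit
    # UDF_BREAK_ON_SUSPICIOUS is an assumed, illustrative variable name.
    if suspicious and os.environ.get("UDF_BREAK_ON_SUSPICIOUS") == "1":
        # Pause in the debugger only for the rows of interest.
        breakpoint()
    return suspicious


# Plain dicts stand in for DataFrame rows in this sketch.
rows = [{"sales": 5000}, {"sales": -20}]
flags = [flag_suspicious_sales(r) for r in rows]
print(flags)
```

Because the pause is guarded by an environment variable, the same function can run unattended when the variable is unset.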
Flexible data sources
Supports various data sources, including in-memory data, local files, and MaxCompute tables.
Data source type | Access method | Use case |
In-memory data | md.DataFrame(pd.DataFrame()) | Quick logic validation |
MaxCompute table | md.read_odps_table() | Testing with real data |
Local files | Native Pandas data interfaces such as pd.read_csv() | Offline development |
Seamless transition to production
The debug code is identical to the production code and can be deployed directly to production after removing debug=True or debug="local".

    # Debug environment
    session = new_session(o, debug=True)

    # Production environment
    session = new_session(o)
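One possible way to avoid editing the source when switching environments is to derive the debug argument from an environment variable. This is a sketch under stated assumptions, not a documented MaxFrame feature: the helper session_options and the variable name MAXFRAME_LOCAL_DEBUG are hypothetical, and only the debug=True argument comes from the description above.

```python
import os


def session_options(env=None):
    """Build keyword arguments for new_session from the environment.

    MAXFRAME_LOCAL_DEBUG is a hypothetical variable name used in this sketch.
    """
    env = os.environ if env is None else env
    if env.get("MAXFRAME_LOCAL_DEBUG", "").lower() in ("1", "true", "local"):
        return {"debug": True}
    return {}


# Intended usage (requires maxframe and an ODPS entry object `o`):
#   session = new_session(o, **session_options())
debug_kwargs = session_options({"MAXFRAME_LOCAL_DEBUG": "1"})
prod_kwargs = session_options({})
print(debug_kwargs, prod_kwargs)
```

With this pattern the production deployment simply leaves the variable unset, so new_session receives no debug argument.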
Quick start
Prerequisites
    pip install --upgrade maxframe  # MaxFrame SDK v2.5.0 or later is required.

Basic example
    from odps import ODPS
    from maxframe import new_session
    import maxframe.dataframe as md
    import pandas as pd

    # Initialize an ODPS object.
    o = ODPS(
        access_id='your_access_id',
        secret_access_key='your_secret_key',
        project='your_project',
        endpoint='your_endpoint'
    )

    # Enable local debug mode.
    session = new_session(o, debug=True)

    # Prepare the data.
    df = md.DataFrame(pd.DataFrame({
        "sales": [5000, 8000, 12000, 3000],
        "region": ["A", "B", "C", "D"]
    }))

    def calculate_commission(row):
        sales = row['sales']
        if sales > 10000:  # You can set a breakpoint here.
            rate = 0.15
            print(rate)
        elif sales > 5000:  # You can set a breakpoint here.
            rate = 0.10
            print(rate)
        else:
            rate = 0.05
        return sales * rate

    # Execute and fetch the result.
    result = df.apply(calculate_commission, axis=1).execute().fetch()
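Because the UDF only reads row['sales'], its branching logic can also be checked without a session by passing plain dicts in place of DataFrame rows. The sketch below repeats calculate_commission from the basic example (minus the print calls) and exercises the two bracket boundaries. The sample_rows list and the use of dicts as stand-in rows are assumptions for illustration.

```python
import math


def calculate_commission(row):
    sales = row["sales"]
    if sales > 10000:
        rate = 0.15
    elif sales > 5000:
        rate = 0.10
    else:
        rate = 0.05
    return sales * rate


# Plain dicts stand in for DataFrame rows; the values mirror the example data.
sample_rows = [{"sales": 5000}, {"sales": 8000}, {"sales": 12000}, {"sales": 3000}]
commissions = [calculate_commission(r) for r in sample_rows]

# The comparisons are strict, so exactly 10000 stays in the 10% bracket
# and exactly 5000 stays in the 5% bracket.
assert math.isclose(calculate_commission({"sales": 10000}), 1000.0)
assert math.isclose(calculate_commission({"sales": 5000}), 250.0)
print([round(c, 2) for c in commissions])
```

A check like this catches bracket mistakes early; the breakpoint-based session run then confirms the behavior on DataFrame rows.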
Considerations
Performance differences: The local debug mode is designed for development and validation. Its performance is not representative of the production environment.
Data volume limits: Use small datasets for debugging, because UDFs run in your local Python environment and are constrained by local resources.
Dependency consistency: Ensure that the dependency versions in your local Python environment match those in the production environment.
Sensitive data: When debugging with a MaxCompute table, be mindful of data permissions and mask sensitive data as needed.
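The dependency consistency point above can be checked with the standard library before a debug run. The following is a minimal sketch: the helper find_version_mismatches and the pinned versions in expected_versions are hypothetical placeholders, so substitute the versions your production environment actually uses.

```python
from importlib import metadata


def find_version_mismatches(expected):
    """Return {package: (expected, installed)} for each differing package.

    `installed` is None when the package is not installed locally.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


# Hypothetical pins; replace them with the versions used in production.
expected_versions = {"pandas": "1.5.3", "numpy": "1.24.4"}
for name, (wanted, installed) in find_version_mismatches(expected_versions).items():
    print(f"{name}: production uses {wanted}, local has {installed or 'not installed'}")
```

An empty result means every listed package matches its pin. The comparison is an exact string match, so compatible but different versions are still reported.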