This page covers common errors and how-to questions for MaxFrame.
Problem 1: Error "invalid type INT for function UDF definition, you need to set odps.sql.type.system.odps2=true; to use it"
MaxCompute V2.0 data types are not enabled by default. When your user-defined function (UDF) uses V2.0 types like INT, the job fails at execution time.
Add the following flag before calling new_session:
```python
from maxframe import config

config.options.sql.settings = {
    "odps.sql.type.system.odps2": "true"
}
```
Problem 2: Error "UDF : No module named 'cloudpickle'"
The cloudpickle package is missing from the runtime environment. Reference the MaxCompute base image to include it:
```python
from maxframe import config

config.options.sql.settings = {
    "odps.session.image": "common",
}
```
Problem 3: How to reuse resources in a UDF submitted by a DataFrame apply
When a UDF needs to initialize expensive resources — such as loading an ML model or creating a database connection — you want initialization to happen once per worker, not once per row.
Python initializes default parameter values only once per function definition. Storing shared state in a mutable default argument (like a dict) exploits this behavior so that initialization runs exactly once per UDF worker.
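As a minimal, framework-independent illustration of this mutable-default behavior (plain Python, no MaxFrame involved; the function name is illustrative):

```python
def counter(_ctx={}):
    # The default dict is created once, when `def` runs, so every
    # call to this function mutates the same object.
    _ctx["calls"] = _ctx.get("calls", 0) + 1
    return _ctx["calls"]

print(counter())  # 1
print(counter())  # 2: state persisted across calls
```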
The following example loads a YOLO model only once per worker:
```python
import os

def predict(s, _ctx={}):
    from ultralytics import YOLO
    # _ctx is initialized as an empty dict once per worker process.
    # On the first call, load the model and store it. Subsequent calls reuse it.
    if not _ctx.get("model", None):
        _ctx["model"] = YOLO(os.path.join("./", "yolo11n.pt"))
    model = _ctx["model"]
    # Call the model APIs here.
```
For resources that require cleanup (such as database connections), use a custom class with `__init__` and `__del__`:
```python
class MyConnector:
    def __init__(self):
        # Open the connection when the object is created.
        self.conn = create_connection()

    def __del__(self):
        # Close the connection when the object is garbage-collected.
        try:
            self.conn.close()
        except Exception:
            pass

def process(s, connector=MyConnector()):
    # The connector is shared across all calls within this worker.
    # No need to open or close the connection inside the UDF.
    connector.conn.execute("xxxxx")
```
Initialization runs once per UDF worker, not once globally. If a UDF processes 100,000 rows across 10 workers, each worker handles 10,000 rows and runs initialization once, so initialization runs 10 times in total.
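The once-per-worker arithmetic can be simulated locally (plain Python; each call to the hypothetical `make_worker_udf` stands in for one worker process deserializing its own copy of the UDF, so each copy gets its own default dict):

```python
total_inits = 0

def make_worker_udf():
    """Each simulated worker gets its own copy of the UDF and its default dict."""
    def udf(row, _ctx={}):
        global total_inits
        if "model" not in _ctx:
            _ctx["model"] = "loaded-model"  # stand-in for an expensive load
            total_inits += 1
        return _ctx["model"]
    return udf

for _ in range(10):              # 10 workers
    worker_udf = make_worker_udf()
    for _ in range(10_000):      # 10,000 rows per worker
        worker_udf(None)

print(total_inits)  # 10: once per worker, not once per row
```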
Problem 4: How to update the MaxFrame version in DataWorks resource groups (exclusive and general-purpose)
Content coming soon.
Problem 5: Best practices for using MaxFrame custom images
Content coming soon.
Problem 6: ODPS-0130071: Semantic analysis exception — sequence_row_id cannot be applied
Add index_col to your read_odps_table call. Without it, MaxFrame cannot assign row IDs, which causes the physical plan to fail.
```python
df2 = md.read_odps_table("tablename", index_col="column").to_pandas()
df2.reset_index(inplace=True)
```
Problem 7: Error "Cannot determine dtypes by calculating with enumerate data, please specify it as arguments" when using apply
MaxFrame infers the return type of your UDF to validate and build the output DataFrame or Series. This inference fails in two situations:
- The UDF cannot run in the current environment — for example, it depends on a custom image or a third-party library that isn't installed, or requires input parameters that aren't available during inference.
- The actual return type doesn't match the `output_type` you specified.
Specify dtypes explicitly to tell MaxFrame what the UDF returns:

- Return a DataFrame with one `int` column: `df.apply(..., dtypes=pd.Series([np.int_]), output_type="dataframe")`
- Return a DataFrame with columns `A` (int) and `B` (str): `df.apply(..., dtypes={"A": np.int_, "B": np.str_}, output_type="dataframe")`
- Return a Series named `flag` with a `bool` type: `df.apply(..., dtype="bool", name="flag", output_type="series")`
Problem 8: How to add a flag the same way as in SQL
```python
from maxframe import config

config.options.sql.settings = {
    "odps.stage.mapper.split.size": "8",  # Input split size for mappers, in MB
    "odps.stage.joiner.num": "20",        # Number of joiner instances
}
```
Problem 9: How to reference third-party packages in MaxFrame development
See Reference third-party packages and images for full instructions.
To reference a MaxCompute resource in your UDF, use the @with_resources decorator:
```python
from maxframe.udf import with_resources

@with_resources("resource_name")
def process(row):
    ...
```
Problem 10: Task error "TypeError: Cannot accept arguments append_partitions"
Upgrade PyODPS to version 0.12.0 or later:
```shell
pip install --upgrade pyodps
```
Problem 11: How to parse many JSON string fields
MaxFrame SDK V1.0.0 and later supports parsing multiple JSON string fields using Series.mf.flatjson:
<https://maxframe.readthedocs.io/en/latest/reference/dataframe/generated/maxframe.dataframe.Series.mf.flatjson.html>
Problem 12: ODPS-0010000: Fuxi job failed — Job failed for unknown reason, cannot get jobstatus
This error usually means that dependency installation failed when using @with_python_requirements or similar methods. The PythonPack node that installs pip dependencies couldn't reach the dependency repository — often a transient network issue.
Check the stderr in the PythonPack Logview for details. A typical message looks like:
```
requests.exceptions.ConnectionError: HTTPConnectionPool(host='service.cn-beijing-intranet.maxcompute.aliyun-inc.com', port=80): Max retries exceeded with url: ...
(Caused by NameResolutionError("Failed to resolve 'service.cn-beijing-intranet.maxcompute.aliyun-inc.com'"))
```
To fix this:
1. Retry the job. If the error was caused by a temporary network issue, retrying usually works. If it persists, contact the MaxFrame team.
2. Cache the packaging result for periodic jobs. Once PythonPack completes successfully, cache the result so subsequent daily jobs skip the install step:
   ```python
   from maxframe import options

   # Subsequent jobs reuse the cached PythonPack result instead of reinstalling.
   options.pythonpack.task.settings = {"odps.pythonpack.production": "true"}
   ```
   To force a rebuild and ignore the cache, add `force_rebuild=True` in `@with_python_requirements`.
3. Package dependencies offline. Avoid PythonPack entirely by packaging dependencies offline with PyODPS-Pack, uploading them as a MaxFrame resource, and referencing them in the job. PyODPS-Pack builds packages in a manylinux Docker container to avoid compatibility issues. It runs on x86 Linux machines; Apple devices with M-series ARM chips are not supported. After uploading, reference the resource in your UDF with `@with_resources`.
Problem 13: ODPS-0123055: User script exception
This is the most common MaxFrame error. It occurs when a UDF throws a Python exception during execution of operators like apply, apply_chunk, flatmap, map, or transform.
How to read the error
Check the stderr of the failed instance. The stack trace points directly to the line that failed. For example, calling json.loads on a non-JSON string produces this output — the message identifies simple_failure at line 5 as the source:
```
ODPS-0123055:User script exception - Traceback (most recent call last):
  ...
  File "...", line 5, in simple_failure
  File ".../json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
```
If the UDF depends on a library that isn't installed in the runtime environment, it can't be deserialized. The error message names the missing module:
```
ModuleNotFoundError: No module named 'xxhash'
```
Common causes and fixes
| Cause | Fix |
|---|---|
| Logic error in UDF code | Analyze the stack trace to find the failing function, then fix it |
| Unhandled exception in try-except block | Make sure all exception types are handled |
| Network access attempted | Enable network access (see Network enablement process) |
| `dtype` or `dtypes` doesn't match the actual return type | Update `dtype`/`dtypes` to match what the UDF actually returns |
| Missing dependency in runtime environment | Install the dependency via PythonPack, or package it as a resource with `@with_resources` |
Debugging locally
Reproduce the error by constructing sample input and calling the function as a regular Python function:
```python
import pandas as pd

def udf_func(row):
    import json
    text = row["json_text"]
    data = json.loads(text)
    return data

# Test locally with sample data
udf_func(pd.Series(['{"hello": "maxframe"}'], index=["json_text"]))
```
Problem 14: ODPS-0123144: Fuxi job failed — kInstanceMonitorTimeout CRASH_EXIT
The UDF timed out. In MaxCompute offline computing, UDF execution is monitored by row batches — if a UDF doesn't finish processing a batch within the configured time limit, the job is terminated.
Adjust the batch size and timeout as needed:
```python
from maxframe import options

options.sql.settings = {
    # Number of rows per batch. Default: 1024. Minimum: 1.
    # Reduce this if individual rows take a long time to process.
    "odps.sql.executionengine.batch.rowcount": "1",
    # Time limit per batch, in seconds. Default: 1800. Maximum: 3600.
    "odps.function.timeout": "3600",
}
```
Problem 15: ODPS-0123144: Fuxi job failed — Job exceed live limit
MaxCompute jobs have a maximum runtime of 24 hours by default. When a job exceeds this limit, it is marked as Failed and terminated.
If the job runs on DataWorks, a different timeout may apply and the job status shows as Canceled. Contact the DataWorks team for details.
Increase the session and job time limits before submitting long-running jobs:
```python
from maxframe import options

# Extend the maximum session lifetime (in seconds)
options.session.max_alive_seconds = 72 * 60 * 60
# Extend the maximum session idle time (in seconds)
options.session.max_idle_seconds = 72 * 60 * 60
options.sql.settings = {
    # Maximum SQL job runtime in hours. Default: 24. Maximum: 72.
    "odps.sql.job.max.time.hours": 72,
}
```
Problem 16: ODPS-0130071: Semantic analysis exception — unable to retrieve row count of file pangu://xxx
This error occurs when MaxCompute cannot read the row count metadata for a source table — typically because no meta file was generated when data was written to the table. Without this metadata, MaxCompute can't accurately split the table for distributed processing.
Option 1: Use odps.stage.mapper.split.size instead of odps.sql.split.dop. This flag controls split size in MB (default: 256, minimum: 1) and doesn't rely on row count metadata.
Option 2: If you need precise splitting, contact the MaxCompute team to regenerate the compact meta file (CMF).
To ensure meta files are generated in future write operations, add these flags:
```python
from maxframe import options

options.sql.settings = {
    "odps.task.merge.enabled": "false",
    "odps.sql.reshuffle.dynamicpt": "false",
    "odps.sql.enable.dynaparts.stats.collection": "true",
    "odps.optimizer.dynamic.partition.is.first.nth.value.split.enable": "false",
    "odps.sql.stats.collection.aggressive": "true",
}
```
Problem 17: ODPS-0130071:[x,y] Semantic analysis exception
In a ReadOdpsQuery scenario, this error usually indicates a semantic problem in the SQL query itself.
1. Check the SQL syntax.
2. Upgrade the MaxFrame client:
   ```shell
   pip install --upgrade maxframe
   ```
3. If the error persists, contact the MaxFrame team.
Problem 18: ODPS-0020041:StringOutOfMaxLength:String length X is larger than maximum Y
A string in your data exceeds MaxCompute's storage-layer limit of 268,435,456 characters. This can happen when writing to a table or during a shuffle operation.
Option 1: Filter or truncate the oversized data. In a ReadOdpsQuery, use the LENGTH function to filter rows before they hit the limit.
Option 2: Compress the data before storing it. gzip can significantly reduce string size:
```python
import gzip

def compress_string(input_string):
    """Compresses a string using gzip."""
    encoded_string = input_string.encode('utf-8')
    compressed_bytes = gzip.compress(encoded_string)
    return compressed_bytes
```
Option 3: Contact the MaxCompute team for help with specific data.
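To read the compressed values back, a matching decompression helper can be used (a minimal sketch; `decompress_string` is not part of MaxFrame and assumes the bytes were produced by gzip-compressing UTF-8 text as shown above):

```python
import gzip

def decompress_string(compressed_bytes):
    """Reverses the gzip compression, assuming UTF-8 text."""
    return gzip.decompress(compressed_bytes).decode('utf-8')

# Round trip on repetitive data, where gzip shrinks the payload dramatically:
original = '{"hello": "maxframe"}' * 1000
compressed = gzip.compress(original.encode('utf-8'))
assert decompress_string(compressed) == original
print(len(original), "->", len(compressed))
```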
Problem 19: ODPS-0010000:System internal error — fuxi job failed, caused by: process exited with code 0
A job containing a UDF or AI function failed due to an out-of-memory (OOM) error.
1. Contact the MaxCompute team to confirm the actual memory usage.
2. Run the UDF or AI function with more memory. For a UDF, use `@with_running_options`:
   ```python
   @with_running_options(memory="8GB")
   def udf_func(row):
       return row
   ```
   For an AI function, pass `running_options={"memory": "8GB"}` in the function call.
Problem 20: ODPS-0123131:User defined function exception — internal error — Fatal Error Happended
This error occurs when reading from or writing to an external table. Contact the MaxCompute team.
Problem 21: ODPS-0010000:System internal error — MetastoreServerException: 0420111:Database not found
The schema, project, or table referenced in your SQL cannot be found in the metadata store.
Check that the project, schema, and table names in your SQL are correct. Fix any errors and retry.
If the information is correct and the error persists, contact the MaxCompute team.
Problem 22: ODPS-0010000:System internal error — fuxi job failed, caused by: process killed by signal 7
The UDF sent an abnormal signal during runtime.
Check whether the UDF sends any signals to the process (for example, for cancellation or timeout handling).
If no signals are sent from your code, contact the MaxCompute team for troubleshooting.
Problem 23: ODPS-0010000:System internal error — fuxi job failed, caused by: StdException:vector::_M_range_insert
The UDF couldn't allocate enough memory, causing a vector insertion to fail.
Check the UDF for memory issues. Verify that any native dependency libraries are up to date and don't have known memory bugs. Increase the memory allocated to the UDF.
If the issue persists, contact the MaxCompute team.
Problem 24: ODPS-0130071: Semantic analysis exception — task:M1 instance count exceeds limit 99999
By default, MaxCompute splits source tables into 256 MB chunks for distributed processing. If the total number of chunks exceeds 99,999, the job fails. This happens when the source table is very large or when a split flag is misconfigured.
- Increase the split size with `odps.stage.mapper.split.size` (unit: MB, default: 256, minimum: 1). A larger value reduces the number of chunks.
- Set a target chunk count with `odps.sql.split.dop` (minimum: 1).

If neither approach works after multiple adjustments, contact the MaxCompute team. Due to internal constraints, the final chunk count may differ from the target; setting the target close to the 99,999 limit may still trigger the error.
Problem 25: ODPS-0110061:Failed to run ddltask — ODPS-0130131:Table not found
This error appears in long-running MaxFrame jobs (more than 24 hours). An internal data definition language (DDL) task fails because a temporary table created earlier in the session has expired.
Temporary tables created during computation (for example, after a df.execute() call) have a default time-to-live (TTL) of one day. Sink tables specified with to_odps_table are not affected.
Increase the TTL for temporary tables to cover the expected job duration:
```python
from maxframe import options

options.sql.settings = {
    # TTL in days. Set this to the maximum number of days your job may run.
    "session.temp_table_lifecycle": 3,
}
```
Problem 26: NoTaskServerResponseError
The MaxFrame session expired. By default, a session expires after 1 hour of inactivity. If you pause in a Jupyter Notebook for more than 1 hour before running the next cell, the session is gone.
If you've already hit this error: Recreate the session. Computation state from previous cells is not preserved.
To avoid this error on future runs: Extend the idle timeout before starting the session:
```python
from maxframe import options

# Set the session idle timeout to 24 hours (default is 1 hour).
options.session.max_idle_seconds = 60 * 60 * 24
```
Problem 27: IntCastingNaNError: Cannot convert non-finite values (NA or inf) to integer: Error while type casting for column 'xx'
This error appears when printing a DataFrame that contains a BIGINT or INT column with NULL or INF values — including automatic display in a Jupyter Notebook cell. The underlying issue is that pandas cannot represent NULL in integer columns; NULL values are internally stored as FLOAT, and casting them back to integer fails.
The MaxFrame team is working on a long-term fix. For now, use one of these workarounds:
- Fill `NULL` values before printing: `df["col"].fillna(0)`
- Convert the column to float before printing: `df["col"].astype(float)`
- Skip printing the column unless necessary.
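The underlying pandas behavior can be reproduced locally (plain pandas, no MaxFrame; the sample values are illustrative):

```python
import numpy as np
import pandas as pd

s = pd.Series([1, None, 3])
print(s.dtype)  # float64: the NULL forces the integer column to float

# Casting back to a plain integer dtype fails on the NaN:
try:
    s.astype(np.int64)
except ValueError as e:
    print(e)  # Cannot convert non-finite values (NA or inf) to integer

# The fillna workaround makes the cast valid:
filled = s.fillna(0).astype(np.int64)
print(filled.tolist())  # [1, 0, 3]
```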
Problem 28: ODPS-0010000:System internal error — fuxi job failed, SQL job failed after failover for too many times
Shuffle data is too large, causing the Job Master to run out of memory (OOM). This typically happens when a job with Reduce or Join operations generates too many mapper or reducer/joiner instances.
Common triggers:
- A very small `split.size` or very large `split.dop` value that creates too many mapper instances
- A large `reducer.num` or `joiner.num` value that creates too many reducer or joiner instances
Reduce the number of mappers and reducers/joiners. The combined total should not exceed 10,000. If the error persists, contact the MaxCompute team.
Problem 29: ODPS-0010000:System internal error — Total resource size must be <= 2048MB
A UDF depends on resources whose combined size exceeds the 2048 MB limit.
Use external volume acceleration to download the resources from Object Storage Service (OSS) at runtime instead. This approach avoids the 2048 MB constraint and provides faster download speeds.
Problem 30: ODPS-0130071: Semantic analysis exception — column values_list in source has incompatible type ARRAY/MAP/STRUCT
The data contains arrays, maps, or structs, and the type declaration doesn't match.
1. Upgrade the MaxFrame client and retry:
   ```shell
   pip install -U maxframe
   ```
2. If the error persists, contact the MaxFrame team. This may be a bug in the MaxFrame type system.
Problem 31: Shuffle output too large
Use the `odps.sql.sys.flag.fuxi_JobMaxInternalFolderSize` flag to specify the maximum shuffle space in MB.
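Like the other flags on this page, it can be set through the options mechanism before submitting the job (a sketch; the value below is a placeholder, choose a size that fits your workload):

```python
from maxframe import options

options.sql.settings = {
    # Maximum shuffle space, in MB (placeholder value).
    "odps.sql.sys.flag.fuxi_JobMaxInternalFolderSize": "4096000",
}
```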