MaxCompute MapReduce jobs are subject to the following limits. Exceeding a limit causes the job to fail.
Limits fall into five categories:
- Memory — the maximum memory a single map or reduce instance can use
- Quantity — caps on instances, inputs, outputs, resources, counters, and retries
- Length — size limits on resources, data splits, and string columns
- Time — how long a worker can run without reading, writing, or sending a heartbeat
- Data type — which field types are supported in table resources
Limits table
| Limit | Value | Category | Configuration parameter | Default | Configurable |
|---|---|---|---|---|---|
| Memory per instance | 256 MB–12 GB | Memory | odps.stage.mapper(reducer).mem and odps.stage.mapper(reducer).jvm.mem | 2,048 MB + 1,024 MB | Yes |
| Resources per job | 256 | Quantity | — | — | No |
| Inputs per job | 1,024 | Quantity | — | — | No |
| Outputs per job | 256 | Quantity | — | — | No |
| Distinct tables across all inputs | 64 | Quantity | — | — | No |
| Custom counters per job | 64 | Quantity | — | — | No |
| Map instances per job | 1–100,000 | Quantity | odps.stage.mapper.num | Calculated from split size | Yes |
| Reduce instances per job | 0–2,000 | Quantity | odps.stage.reducer.num | 1/4 of map instances | Yes |
| Retries per failed instance | 3 | Quantity | — | — | No |
| Local debug: map instances | 2 (default), max 100 | Quantity | — | 2 | No |
| Local debug: reduce instances | 1 (default), max 100 | Quantity | — | 1 | No |
| Local debug: downloaded records per input | 100 (default), max 10,000 | Quantity | — | 100 | No |
| Repeated reads of one resource per instance | 64 | Quantity | — | — | No |
| Total resource size per job | 2 GB | Length | — | — | No |
| Split size | ≥ 1 MB | Length | odps.stage.mapper.split.size | 256 MB | Yes |
| STRING column content length | 8 MB | Length | — | — | No |
| Worker execution timeout | 1–3,600 seconds | Time | odps.function.timeout | 600 seconds | Yes |
| Supported field types in table resources | BIGINT, DOUBLE, STRING, DATETIME, BOOLEAN | Data type | — | — | No |
Limit details
Memory per instance
Each map or reduce instance has two memory pools: framework memory (default 2,048 MB) and Java Virtual Machine (JVM) heap memory (default 1,024 MB). The combined total must be within the 256 MB–12 GB range. Adjust both odps.stage.mapper.mem and odps.stage.mapper.jvm.mem (or their reducer equivalents) when tuning memory allocation.
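The combined-range rule can be sketched as a small check. This is an illustrative helper only, not part of the MaxCompute SDK; the defaults mirror the values documented above:

```python
# Illustrative sketch: checks whether a proposed per-instance memory
# configuration falls inside the documented 256 MB - 12 GB combined range.
MIN_TOTAL_MB = 256
MAX_TOTAL_MB = 12 * 1024  # 12 GB

def memory_config_valid(framework_mb=2048, jvm_mb=1024):
    """Return True if framework memory plus JVM heap memory is within limits.

    Defaults are the documented defaults: 2,048 MB framework + 1,024 MB JVM.
    """
    total = framework_mb + jvm_mb
    return MIN_TOTAL_MB <= total <= MAX_TOTAL_MB
```

For example, raising the framework pool to 12 GB while keeping the 1 GB JVM heap would push the total past 12 GB and fail this check.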
Resources per job
A single job can reference up to 256 resources. Each table and each archive file counts as one resource. The total size of all referenced resources cannot exceed 2 GB.
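Both constraints — at most 256 resources and at most 2 GB in total — must hold at the same time. A minimal sketch (hypothetical helper, not a MaxCompute API) expressing the combined rule:

```python
# Illustrative sketch: each referenced table or archive counts as one
# resource; the job is valid only if both the count and total size limits hold.
MAX_RESOURCES = 256
MAX_TOTAL_BYTES = 2 * 1024**3  # 2 GB

def resources_within_limits(resource_sizes_bytes):
    """resource_sizes_bytes: one size entry per referenced resource."""
    return (len(resource_sizes_bytes) <= MAX_RESOURCES
            and sum(resource_sizes_bytes) <= MAX_TOTAL_BYTES)
```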
Inputs and outputs
- A single partition counts as one input. The total number of inputs cannot exceed 1,024.
- The total number of distinct tables across all inputs cannot exceed 64.
- The total number of outputs cannot exceed 256.
Custom counters
A job can define up to 64 custom counters. Both the Group Name and the Counter Name must exclude the number sign (#), and their combined length cannot exceed 100 characters.
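The naming rules can be captured in a short check. This is an illustrative sketch, not a MaxCompute API; it implements exactly the two documented rules:

```python
# Illustrative sketch: a custom counter name is valid only if neither the
# group name nor the counter name contains '#', and their combined length
# does not exceed 100 characters.
def counter_name_valid(group_name, counter_name):
    if "#" in group_name or "#" in counter_name:
        return False
    return len(group_name) + len(counter_name) <= 100
```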
Map and reduce instances
The framework calculates the number of map instances based on the split size. This is the default behavior — the framework determines the count automatically. To override this, set odps.stage.mapper.num to a value in the range 1–100,000. If no input table is specified, you must set this parameter explicitly.
By default, the number of reduce instances is one-fourth of the number of map instances. Set odps.stage.reducer.num to any value in the range 0–2,000 to override this. A reduce instance may process significantly more data than a map instance, which can slow down the reduce phase.
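The defaults above can be sketched as follows. This is illustrative only: the exact rounding MaxCompute applies when deriving map instances from the split size is not specified here, so this sketch assumes ceiling division; the clamping ranges are the documented ones:

```python
import math

# Illustrative sketch of the documented defaults (not SDK code).
def default_map_instances(input_size_mb, split_size_mb=256):
    """Map count derived from input size and split size, clamped to 1-100,000.

    Assumes ceiling division; the actual calculation is framework-internal.
    """
    maps = math.ceil(input_size_mb / split_size_mb)
    return max(1, min(maps, 100_000))

def default_reduce_instances(map_instances):
    """Default reduce count: one-fourth of map instances, clamped to 0-2,000."""
    return max(0, min(map_instances // 4, 2_000))
```

With a 1,024 MB input and the default 256 MB split size, this yields 4 map instances and 1 reduce instance by default.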
Retries
A failed map or reduce instance is retried up to three times. Some non-retriable exceptions cause the job to fail immediately without retries.
Local debug mode
In local debug mode:
- Map instances: default 2, max 100
- Reduce instances: default 1, max 100
- Downloaded records per input: default 100, max 10,000
Worker execution timeout
A worker times out when it neither reads nor writes data nor sends a heartbeat via context.progress() for longer than the timeout period. The default timeout is 600 seconds (range: 1–3,600 seconds). Set odps.function.timeout to a higher value for jobs with long processing intervals between I/O operations.
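The timeout condition reduces to a simple elapsed-time check. The helper below is a hypothetical sketch, not part of the SDK; "activity" stands for any read, write, or context.progress() call:

```python
# Illustrative sketch: a worker is considered timed out if it has neither
# performed I/O nor sent a heartbeat within the configured window
# (default 600 seconds, configurable in the range 1-3,600).
def worker_timed_out(now_s, last_activity_s, timeout_s=600):
    return (now_s - last_activity_s) > timeout_s
```

A worker with a long compute-only stretch between I/O operations should either call context.progress() periodically or run with a larger odps.function.timeout.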
Feature limits
| Feature | Supported |
|---|---|
| Reading data from OSS (Object Storage Service) | No |
| New data types introduced in MaxCompute V2.0 | No |
| Running MapReduce jobs in schema-enabled projects | No |
When a MapReduce task references a table resource, an error is reported if the table contains fields of unsupported data types.
If you upgrade a project to support schemas, existing MapReduce jobs in that project can no longer run.