MaxCompute MapReduce jobs are subject to the following limits. Exceeding a limit causes the job to fail.
Limits fall into five categories:
- Memory — the maximum memory a single map or reduce instance can use
- Quantity — caps on instances, inputs, outputs, resources, counters, and retries
- Length — size limits on resources, data splits, and string columns
- Time — how long a worker can run without reading, writing, or sending a heartbeat
- Data type — which field types are supported in table resources
Limits table
| Limit | Value | Category | Configuration parameter | Default | Configurable |
|---|---|---|---|---|---|
| Memory per instance | 256 MB–12 GB | Memory | odps.stage.mapper(reducer).mem and odps.stage.mapper(reducer).jvm.mem | 2,048 MB + 1,024 MB | Yes |
| Resources per job | 256 | Quantity | — | — | No |
| Inputs per job | 1,024 | Quantity | — | — | No |
| Outputs per job | 256 | Quantity | — | — | No |
| Distinct tables across all inputs | 64 | Quantity | — | — | No |
| Custom counters per job | 64 | Quantity | — | — | No |
| Map instances per job | 1–100,000 | Quantity | odps.stage.mapper.num | Calculated from split size | Yes |
| Reduce instances per job | 0–2,000 | Quantity | odps.stage.reducer.num | 1/4 of map instances | Yes |
| Retries per failed instance | 3 | Quantity | — | — | No |
| Local debug: map instances | 2 (default), max 100 | Quantity | — | 2 | No |
| Local debug: reduce instances | 1 (default), max 100 | Quantity | — | 1 | No |
| Local debug: downloaded records per input | 100 (default), max 10,000 | Quantity | — | 100 | No |
| Repeated reads of one resource per instance | 64 | Quantity | — | — | No |
| Total resource size per job | 2 GB | Length | — | — | No |
| Split size | ≥ 1 MB | Length | odps.stage.mapper.split.size | 256 MB | Yes |
| STRING column content length | 8 MB | Length | — | — | No |
| Worker execution timeout | 1–3,600 seconds | Time | odps.function.timeout | 600 seconds | Yes |
| Supported field types in table resources | BIGINT, DOUBLE, STRING, DATETIME, BOOLEAN | Data type | — | — | No |
Limit details
Memory per instance
Each map or reduce instance has two memory pools: framework memory (default 2,048 MB) and Java Virtual Machine (JVM) heap memory (default 1,024 MB). The combined total must be within the 256 MB–12 GB range. Adjust both odps.stage.mapper.mem and odps.stage.mapper.jvm.mem (or their reducer equivalents) when tuning memory allocation.
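The combined-range rule can be sketched as a small check. This is an illustrative helper only, not part of the MaxCompute SDK; the defaults mirror the values documented above:

```python
# Illustrative sketch: checks whether a proposed per-instance memory
# configuration falls inside the documented 256 MB - 12 GB combined range.
MIN_TOTAL_MB = 256
MAX_TOTAL_MB = 12 * 1024  # 12 GB

def memory_config_valid(framework_mb=2048, jvm_mb=1024):
    """Return True if framework memory plus JVM heap memory is within limits.

    Defaults are the documented defaults: 2,048 MB framework + 1,024 MB JVM.
    """
    total = framework_mb + jvm_mb
    return MIN_TOTAL_MB <= total <= MAX_TOTAL_MB
```

For example, raising the framework pool to 12 GB while keeping the 1 GB JVM heap would push the total past 12 GB and fail this check.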
Resources per job
A single job can reference up to 256 resources. Each table and each archive file counts as one resource. The total size of all referenced resources cannot exceed 2 GB.
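Both constraints — at most 256 resources and at most 2 GB in total — must hold at the same time. A minimal sketch (hypothetical helper, not a MaxCompute API) expressing the combined rule:

```python
# Illustrative sketch: each referenced table or archive counts as one
# resource; the job is valid only if both the count and total size limits hold.
MAX_RESOURCES = 256
MAX_TOTAL_BYTES = 2 * 1024**3  # 2 GB

def resources_within_limits(resource_sizes_bytes):
    """resource_sizes_bytes: one size entry per referenced resource."""
    return (len(resource_sizes_bytes) <= MAX_RESOURCES
            and sum(resource_sizes_bytes) <= MAX_TOTAL_BYTES)
```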
Inputs and outputs
- A single partition counts as one input. The total number of inputs cannot exceed 1,024.
- The total number of distinct tables across all inputs cannot exceed 64.
- The total number of outputs cannot exceed 256.
Custom counters
A job can define up to 64 custom counters. Both the Group Name and the Counter Name must exclude the number sign (#), and their combined length cannot exceed 100 characters.
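The naming rules can be captured in a short check. This is an illustrative sketch, not a MaxCompute API; it implements exactly the two documented rules:

```python
# Illustrative sketch: a custom counter name is valid only if neither the
# group name nor the counter name contains '#', and their combined length
# does not exceed 100 characters.
def counter_name_valid(group_name, counter_name):
    if "#" in group_name or "#" in counter_name:
        return False
    return len(group_name) + len(counter_name) <= 100
```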
Map and reduce instances
The framework calculates the number of map instances based on the split size. This is the default behavior — the framework determines the count automatically. To override this, set odps.stage.mapper.num to a value in the range 1–100,000. If no input table is specified, you must set this parameter explicitly.
By default, the number of reduce instances is one-fourth of the number of map instances. Set odps.stage.reducer.num to any value in the range 0–2,000 to override this. A reduce instance may process significantly more data than a map instance, which can slow down the reduce phase.
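The defaults above can be sketched as follows. This is illustrative only: the exact rounding MaxCompute applies when deriving map instances from the split size is not specified here, so this sketch assumes ceiling division; the clamping ranges are the documented ones:

```python
import math

# Illustrative sketch of the documented defaults (not SDK code).
def default_map_instances(input_size_mb, split_size_mb=256):
    """Map count derived from input size and split size, clamped to 1-100,000.

    Assumes ceiling division; the actual calculation is framework-internal.
    """
    maps = math.ceil(input_size_mb / split_size_mb)
    return max(1, min(maps, 100_000))

def default_reduce_instances(map_instances):
    """Default reduce count: one-fourth of map instances, clamped to 0-2,000."""
    return max(0, min(map_instances // 4, 2_000))
```

With a 1,024 MB input and the default 256 MB split size, this yields 4 map instances and 1 reduce instance by default.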
Retries
A failed map or reduce instance is retried up to three times. Some non-retriable exceptions cause the job to fail immediately without retries.
Local debug mode
In local debug mode:
- Map instances: default 2, max 100
- Reduce instances: default 1, max 100
- Downloaded records per input: default 100, max 10,000
Worker execution timeout
A worker times out when it neither reads nor writes data nor sends a heartbeat via context.progress() for longer than the timeout period. The default timeout is 600 seconds (range: 1–3,600 seconds). Set odps.function.timeout to a higher value for jobs with long processing intervals between I/O operations.
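The timeout condition reduces to a simple elapsed-time check. The helper below is a hypothetical sketch, not part of the SDK; "activity" stands for any read, write, or context.progress() call:

```python
# Illustrative sketch: a worker is considered timed out if it has neither
# performed I/O nor sent a heartbeat within the configured window
# (default 600 seconds, configurable in the range 1-3,600).
def worker_timed_out(now_s, last_activity_s, timeout_s=600):
    return (now_s - last_activity_s) > timeout_s
```

A worker with a long compute-only stretch between I/O operations should either call context.progress() periodically or run with a larger odps.function.timeout.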
Feature limits
| Feature | Supported |
|---|---|
| Reading data from OSS (Object Storage Service) | No |
| New data types introduced in MaxCompute V2.0 | No |
| Running MapReduce jobs in schema-enabled projects | No |
When a MapReduce task references a table resource, an error is reported if the table contains fields of unsupported data types.
If you upgrade a project to support schemas, existing MapReduce jobs in that project can no longer run.