This topic describes how to use the SET command to set MaxCompute or user-defined system variables.
Description: This command is used to set MaxCompute system variables.
SETcommand only supports lower-case letters.
- For project-level settings, see SetProject.
- KEY: the attribute name that you want to change.
- VALUE: the attribute value. The following table describes the attributes and their values.
Attribute Description Value range odps.sql.allow.fullscan Specifies whether to allow a full scan on a partitioned table. Valid values: True (allow) and False (prohibit). odps.stage.mapper.mem Sets the memory size of each Map worker. Unit: MB. Default value: 1024 MB. odps.stage.reducer.mem Sets the memory size of each Reduce worker. Unit: MB. Default value: 1024 MB. odps.stage.joiner.mem Sets the memory size of each Join worker. Unit: MB. Default value: 1024 MB. odps.stage.mem Sets the total memory size of all workers in a specific MaxCompute job. This attribute has a lower priority than the preceding three attributes. Unit: MB. No default value. odps.stage.mapper.split.size Modifies the input data volume of each Map worker, which is the split size of the input file. You can use this attribute to indirectly control the number of workers in each Map stage. Unit: MB. Default value: 256 MB. odps.stage.reducer.num Modifies the number of workers in each Reduce stage. No default value. odps.stage.joiner.num Modifies the number of workers in each Join stage. No default value. odps.stage.num Modifies the worker concurrency at all stages in a specific MaxCompute job. This attribute has a lower priority than the preceding three attributes. No default value. odps.sql.reshuffle.dynamicpt Sets dynamic partitions to avoid the excessive generation of small files. Valid values: True and False. Default value: True.Note If a small number of dynamic partitions are generated, we recommend that you set this parameter to False to avoid data skew. odps.sql.type.system.odps2 Specifies whether to use new data types. You need to set it to True if new data types, such as TINYINT, SMALLINT, INT, FLOAT, VARCHAR, TIMESTAMP, and BINARY, are used in SQL operations. This setting is required for sessions. Valid values: True (enabled) and False (disabled). Default value: False. odps.sql.hive.compatible Specifies whether to enable the Hive compatibility mode. MaxCompute supports various syntax specified by Hive, such as inputRecordReader, outputRecordReader, and Serde only after the Hive compatibility mode is enabled. This setting is required for sessions. Valid values: True (enabled) and False (disabled). Default value: False. odps.sql.executionengine.coldata.deep.buffer.size.max Adjusts the cache size requested for storing a column that contains complex data to improve write performance.
- The output table contains too many complex data types.
- The size of a variable of a complex data type in the output table is too large.
- The default value is 67108864 bytes (64 MB). Unit: bytes.
- By default, MaxCompute reserves 192 MB memory (64 × 3) for table write operations
if the schemas of three columns in the output table are of complex data types, such
as STRING, MAP, STRUCT, ARRAY, and BINARY. The cache requested for each column is
used to store data in
batch row countthat corresponds to this column.
- If you know in advance that variables of complex data types in the table consume less
space, we recommend that you set this parameter to a smaller value. For example, if
the size of each variable of a complex data type does not exceed 1024 bytes and the
default value (1024) of
batch row countis used, you can set the value to 1048576 (1024 × 1024).
- If you know in advance that the size of each variable of a complex data type is between
7 MB and 8 MB, and
batch row countis set to 32, you can adjust this parameter to a value in bytes that equals to 256 MB (8 × 32).
- If the output of a task or the small table in the
MAPJOINoperation contains complex data types, then changes to the value affect the memory used during the job execution. Based on the preceding calculation methods, if the value is too large, out of memory (OOM) may occur during the job execution.
- Adjust the cache size requested for storing a column that contains complex data to
improve write performance.
- Set the size of data read by each Mapper to 256 MB.
Description: This command is used to display the parameters configured by using the SET command.
Note This command takes effect only at the project level. For more information about how to set project-level flags, see Project operations.