Execute the SET statement to configure MaxCompute system variables for the current session - MaxCompute

You can execute the SET statement to configure MaxCompute system variables for the current session.

Syntax

SET <KEY>=<VALUE>

Parameters

KEY: the name of the property.

VALUE: the value of the property.

The following table describes common properties at the session level.

Property	Description	Valid values
console.sql.result.instancetunnel	Specifies whether to enable InstanceTunnel. For more information, see Usage notes.	True: enabled. False: disabled.
odps.stage.mapper.mem	The memory size of each Map worker.	Valid values: 256 to 12288. Default value: 1024. Unit: MB.
odps.stage.reducer.mem	The memory size of each Reduce worker.	Valid values: 256 to 12288. Default value: 1024. Unit: MB.
odps.stage.joiner.mem	The memory size of each Join worker.	Valid values: 256 to 12288. Default value: 1024. Unit: MB.
odps.stage.mem	The total memory size of all workers in a specified MaxCompute job. This property has a lower priority than the odps.stage.mapper.mem, odps.stage.reducer.mem, and odps.stage.joiner.mem properties.	Valid values: 256 to 12288. No default value. Unit: MB.
odps.stage.mapper.split.size	The input data amount of each Map worker, which is the split size of the input file. You can use this property to indirectly control the number of workers at each Map stage. Note If the LIMIT keyword is used in SQL statements, the number of concurrent workers is limited to 1. This way, only a single worker is allowed to run at a time. In this case, when you configure this property, we recommend that you do not add the LIMIT keyword to SQL statements.	Default value: 256. Unit: MB.
odps.stage.reducer.num	The number of workers at each Reduce stage. Note If the LIMIT keyword is used in SQL statements, the number of concurrent workers is limited to 1. This way, only a single worker is allowed to run at a time. In this case, when you configure this property, we recommend that you do not add the LIMIT keyword to SQL statements.	Valid values: 1 to 10000.
odps.stage.joiner.num	The number of workers at each Join stage. Note If the LIMIT keyword is used in SQL statements, the number of concurrent workers is limited to 1. This way, only a single worker is allowed to run at a time. In this case, when you configure this property, we recommend that you do not add the LIMIT keyword to SQL statements.	Valid values: 1 to 10000.
odps.stage.num	The total number of concurrent workers in a specified MaxCompute job. This parameter has a lower priority than the odps.stage.mapper.split.size, odps.stage.reducer.num, and odps.stage.joiner.num parameters. Note If the LIMIT keyword is used in SQL statements, the number of concurrent workers is limited to 1. This way, only a single worker is allowed to run at a time. In this case, when you configure this property, we recommend that you do not add the LIMIT keyword to SQL statements.	Valid values: 1 to 10000.
odps.sql.reshuffle.dynamicpt	Specifies whether to enable dynamic partitioning. Dynamic partitioning prevents the generation of a large number of small files.	True: enabled. False: disabled. Note If a small number of dynamic partitions are generated, we recommend that you set this property to False to prevent data skew issues.
odps.sql.type.system.odps2	Specifies whether to enable the MaxCompute V2.0 data type edition. For more information about the MaxCompute V2.0 data type edition, see MaxCompute V2.0 data type edition.	True: enabled. False: disabled.
odps.sql.hive.compatible	Specifies whether to enable the Hive-compatible data type edition. MaxCompute supports Hive syntaxes, such as `inputRecordReader`, `outputRecordReader`, and `Serde`, only after the Hive-compatible data type edition is enabled. For more information about the Hive-compatible data type edition, see Hive-compatible data type edition.	True: enabled. False: disabled.
odps.sql.executionengine.coldata.deep.buffer.size.max	The cache size requested to store a column that contains data of complex types when MaxCompute writes data to a table. This improves the write performance. You can configure this property if the output table contains excessive data of complex types or the size of a variable of a specified complex type in the output table is too large. If the schemas of three columns in the output table are of complex data types, such as STRING, MAP, STRUCT, ARRAY, or BINARY, MaxCompute automatically reserves 192 MB (64 × 3) of memory for table write operations. The cache requested for each column is used to store data in the row that is specified by `batch row count`. If the estimated size of the variables of complex types in the table is small, we recommend that you set this property to a small value. For example, if the size of each variable of a specific complex type does not exceed 1,024 bytes and the default value (1024) of `batch row count` is used, you can set the value of this property to 1048576 (1024 × 1024). Sample statement: `set odps.sql.executionengine.coldata.deep.buffer.size.max=1048576;` If the estimated size of each variable of a complex type is between 7 MB and 8 MB, and `batch row count` is set to 32, you can adjust the value of this property to 256 MB (8 × 32). If the output table contains variables of complex types or the small table involved in the `MAPJOIN` operation contains data of complex types, the memory used to run jobs may be adversely affected when you adjust the value of this property. Based on the preceding calculation method, a large value of this property may cause an out of memory (OOM) error.	Unit: bytes. Default value: 67108864.
odps.task.quota.preference.tag	The quota group that is allocated to a specified job. You can configure the quota group on MaxCompute Management. For projects that use subscription resources, you can configure this property to specify a level-2 quota group that can be used by a job. A job is preferentially scheduled to a quota group whose quota tag is the same as the quota tag of the job. You must configure a quota tag for a job when you submit the job. If no quota group has the same quota tag, the job is scheduled to the quota group specified by the project to which the job belongs. You can execute the following statement to configure a quota tag: `set odps.task.quota.preference.tag = tag_name;` Note This property is available only for SQL jobs.	tag_name specifies the tag of the quota group configured in MaxCompute Management. You can configure tag_name only for quota groups in the region where the project of your job belongs. The value of the tag_name parameter can contain letters, digits, and underscores (_). Note You can specify the odps.task.quota.preference.tag property only after you configure tags for quota groups in MaxCompute Management.
odps.sql.udf.getjsonobj.new	Specifies whether the GET_JSON_OBJECT function retains the original string when it returns a value. From April 20, 2021, the GET_JSON_OBJECT function automatically retains the original strings for the returned results in new MaxCompute projects. For MaxCompute projects created before April 20, 2021, the GET_JSON_OBJECT function automatically returns JSON-formatted reserved characters by using escape characters. This prevents the impact on existing jobs. For more information about the GET_JSON_OBJECT function, see GET_JSON_OBJECT.	True: enabled. False: disabled.
odps.sql.udf.jvm.memory	The maximum memory size of the Java Virtual Machine (JVM) heap for a user-defined function (UDF). If a large amount of data is calculated and sorted in the memory by using a UDF, the OOM error may occur. In this case, you can increase the value of this property. However, you must optimize UDF code based on your business requirements to prevent OOM errors.	Valid values: 256 to 12288. Default value: 1024. Unit: MB.
odps.function.timeout	The timeout period of a UDF.	Valid values: 0 to 3600. Default value: 600. Unit: seconds.
LabelSecurity	Specifies whether to enable LabelSecurity. For more information about LabelSecurity, see Label-based access control.	True: enabled. False: disabled.
odps.sql.session.resources	The resources that are referenced by user-defined types (UDTs). You can specify multiple resources and separate them with commas (,). For more information about how to reference resources, see Overview.	The uploaded resources.
odps.sql.udt.display.tostring	Specifies whether to enable the `java.util.Objects.toString(...)` mechanism on the Wrap for all the columns whose outputs are UDTs.	True: enabled. False: disabled.
odps.sql.session.java.imports	The Java packages that are referenced by UDTs. You can specify multiple Java packages and separate them with commas (,). For more information about how to reference Java packages, see Overview.	The uploaded Java packages.
CheckPermissionUsingACL	Specifies whether to enable ACL-based access control. For more information about ACL-based access control, see MaxCompute permissions.	True: enabled. False: disabled.
CheckPermissionUsingPolicy	Specifies whether to enable policy-based access control. For more information about policy-based access control, see Policy-based access control.	True: enabled. False: disabled.
ObjectCreatorHasAccessPermission	Specifies whether creators of objects can perform operations on the objects created by themselves.	True: Creators of objects can perform operations on the objects that are created by themselves. False: Creators of objects cannot perform operations on the objects created by themselves.
ObjectCreatorHasGrantPermission	Specifies whether creators of objects can authorize other users to perform operations on the objects.	True: Creators of objects can authorize other users to perform operations on the objects. False: Creators of objects cannot authorize other users to perform operations on the objects.
ProjectProtection	Specifies whether to enable project data protection. For more information about the project data protection mechanism, see Project data protection.	True: enabled. False: disabled.
odps.output.field.formatter	The dynamic data masking rule for SQL query results. For more information about dynamic data masking, see Dynamic data masking.	The custom dynamic data masking rule.
odps.isolation.session.enable	Specifies whether to enable dynamic data masking for a single session. For more information about dynamic data masking, see Dynamic data masking.	True: enabled. False: disabled.
odps.sql.skewjoin	Specifies whether to enable the SKEWJOIN feature to prevent long tails.	True: enabled. False: disabled.
odps.sql.skewinfo	The key and value on which SKEWJOIN is executed. For more information, see Data skew tuning.	-
odps.sql.reducer.instances	The number of hash buckets. For more information about hash buckets, see Table operations.	Valid values: 0 to 4000.
odps.sql.udf.ppr.deterministic	Specifies whether to enable partition pruning for UDFs. For more information about partition pruning, see the "WHERE clause (where_condition)" section in SELECT syntax.	True: enabled. False: disabled.
odps.sql.udf.ppr.to.subquery	Specifies whether to ignore the error that occurs during result backfilling when partition pruning is being performed. For more information about partition pruning, see the "WHERE clause (where_condition)" section in SELECT syntax.	True: enabled. False: disabled.
odps.optimizer.enable.range.partial.repartitioning	Specifies whether to enable the Shuffle Remove feature for range-clustered tables.	True: enabled. False: disabled.
odps.optimizer.skew.join.topk.num	The number of hot key values obtained by the optimizer when it performs an Aggregate operation. For more information, see SKEWJOIN HINT.	-
odps.optimizer.stat.collect.auto	Specifies whether to enable the Freeride feature. After the Freeride feature is enabled, the column stats metrics of tables are automatically collected. For more information, see Optimizer.	True: enabled. False: disabled.
odps.optimizer.stat.collect.plan	The collection plan that is used to collect specific column stats metrics of specific columns. For more information, see Optimizer.	-
odps.sql.executionengine.batch.rowcount	The number of data rows that can be processed by the SQL engine at a time. Default value: 1024. If an OOM error occurs because of a large field in a row or if a timeout occurs due to poor UDF performance, you must set this property to a smaller value. We recommend that you do not set this property to an excessively small value because an excessively small value may adversely affect the performance of the SQL engine.	Valid values: 1 to 1024.
odps.sql.external.net.vpc	Specifies whether to enable virtual private cloud (VPC) for external tables. For more information, see Hologres external tables.	True: enabled. False: disabled.
odps.sql.groupby.position.alias	Specifies whether to process integer constants in the `GROUP BY` clause as column numbers in a SELECT statement.	True: enabled. False: disabled.
odps.sql.groupby.skewindata	Specifies whether to enable the anti-skew feature for the `GROUP BY` clause.	True: enabled. False: disabled.
odps.sql.orderby.position.alias	Specifies whether to process integer constants in the `ORDER BY` clause as column numbers in a SELECT statement.	True: enabled. False: disabled.
odps.sql.groupby.orderby.position.alias	Specifies whether to process integer constants in the `GROUP BY` and `ORDER BY` clauses as column numbers in a SELECT statement. We recommend that you specify this property. Note If you set this property to True for projects in which jobs are running, the job parsing and running may be adversely affected. Therefore, you must make sure that existing jobs can still run properly based on the original logic when you set the property to True. Otherwise, specify this property for sessions.	True: enabled. False: disabled.
odps.sql.mapjoin.memory.max	The size of small tables that are read to the memory when the MAPJOIN statement is executed. Unit: MB.	Valid values: 0 to 8192.
odps.sql.metering.value.max	The upper limit on resources consumed by an SQL statement. For more information, see Consumption control.	-
odps.sql.python.version	The Python version on which SQL statements are executed.	cp27 cp37
odps.sql.select.output.format	Specifies whether to display the table headers in the returned results of the MaxCompute client. For more information, see MaxCompute client (odpscmd).	`{""needHeader"":false,""fieldDelim"":""""}`: The table headers are not displayed. `{""needHeader"":true,""fieldDelim"":""""}`: The table headers are displayed.
odps.sql.timezone	The time zone of the MaxCompute project that you accessed. For more information about time zones, see Time zone configuration operations.	-
odps.sql.unstructured.data.oss.use.https	Specifies whether to enable the mechanism of obtaining data by using HTTPS at the underlying layer when you access external tables. For more information, see Create an OSS external table.	True: enabled. False: disabled.
odps.sql.unstructured.oss.commit.mode	Specifies whether to enable the multipart upload feature of Object Storage Service (OSS) to write data to OSS external tables. For more information, see Write data to OSS.	True: enabled. False: disabled.
odps.sql.decimal.tostring.trimzero	Specifies whether to remove the zero at the end of the decimal point when the data type is DECIMAL. The default value is `True`, which indicates to remove the zero at the end of the decimal point.	True: The zero at the end of the decimal point is removed. False: The zero at the end of the decimal point is retained.
odps.sql.unstructured.tablestore.put.row	Specifies whether to call the PutRow operation to write data to a Tablestore external table. Note For more information about PutRow, see PutRow.	True: enabled. False: disabled.
odps.sql.unstructured.external.max.dop	The maximum concurrency of workers that are accessed from an external tables.	The value of this property must be a positive integer.
odps.sql.executionengine.enable.string.to.date.full.format	Specifies whether to convert a date string that contains the hour, minute, and second parts.	True: enabled. False: enabled. Default value: False.
odps.sql.executionengine.enable.rand.time.seed	Specifies whether the RAND function uses the current system time or the current instance ID as the seed to generate random numbers.	True: The RAND function uses the current system time as the seed to generate random numbers. However, the RAND function no longer ensures idempotence and cannot be used as the shuffle key. As a result, rerun results may be different. False: The RAND function uses the current instance ID as the seed to generate random numbers. The RAND function can ensure idempotence. Default value: False.
odps.sql.type.json.enable	Specifies whether to enable JSON data types. For more information about JSON data types, see Instructions for using JSON data types in MaxCompute.	True: enabled. False: disabled.
odps.ext.oss.orc.native	Specifies whether to use the native ORC reader to read tables.	True: The native ORC reader is used. False: The native ORC reader is not used.
odps.sql.job.max.time.hours	The maximum duration in which a single job runs. Note If you need to configure the property at the project level, you can click the link or search for the DingTalk group (ID: 11782920) to contact MaxCompute technical support.	Valid values: 24 to 72. Unit: hours. Default value: 24.
odps.task.wlm.quota	The name of the quota that is used to run tasks in the current session. For more information about quotas, see Manage quotas in the new MaxCompute console.	-

Note

If you need to query the MaxCompute system variables that are configured for the current session, you can view them in the SourceXML of LogView. For more information, see Use LogView 2.0 to view job running information.

Examples

-- Set the size of data read by each Mapper to 256 MB. 
SET odps.stage.mapper.split.size=256;

Related statements

SHOW: This statement is used to display the properties that you configured by using the SET statement.