Use the SET statement to configure MaxCompute session variables for the current session.
Syntax
SET <key>=<value>Parameters
| Parameter | Description |
|---|---|
key | The name of the property. |
value | The value to assign. |
Session properties
The following table lists the properties you can configure with the SET statement. Settings apply only to the current session.
To view the variables configured for the current session, run the SHOW statement. You can also find the values in the SourceXML of LogView 2.0.
Compute resources
| Property | Description | Valid values | Default |
|---|---|---|---|
odps.stage.mapper.mem | Memory allocated to each Map worker. | 256–12288 | 1024 MB |
odps.stage.reducer.mem | Memory allocated to each Reduce worker. | 256–12288 | 1024 MB |
odps.stage.joiner.mem | Memory allocated to each Join worker. | 256–12288 | 1024 MB |
odps.stage.mem | Total memory for all workers in a job. Lower priority than odps.stage.mapper.mem, odps.stage.reducer.mem, and odps.stage.joiner.mem. | 256–12288 | — |
odps.stage.mapper.split.size | Input data size per Map worker (split size of the input file). Adjusting this value indirectly controls the number of Map workers. | — | 256 MB |
odps.stage.reducer.num | Number of workers at each Reduce stage. | 1–10000 | — |
odps.stage.joiner.num | Number of workers at each Join stage. | 1–10000 | — |
odps.stage.num | Total number of concurrent workers in a job. Lower priority than odps.stage.mapper.split.size, odps.stage.reducer.num, and odps.stage.joiner.num. | 1–10000 | — |
When the LIMIT keyword is used in a SQL statement, concurrency is restricted to a single worker. If you configure any of the above worker-count or split-size properties, omit the LIMIT keyword from your SQL statements.
SQL engine behavior
| Property | Description | Valid values | Default |
|---|---|---|---|
odps.sql.reshuffle.dynamicpt | Enables dynamic partitioning, which prevents a large number of small files from being generated. Set to False when a small number of dynamic partitions are expected, to prevent data skew. | True / False | — |
odps.sql.type.system.odps2 | Enables the MaxCompute V2.0 data type edition. | True / False | — |
odps.sql.hive.compatible | Enables the Hive-compatible data type edition. Required for Hive syntaxes such as inputRecordReader, outputRecordReader, and Serde. | True / False | — |
odps.sql.executionengine.batch.rowcount | Number of rows the SQL engine processes per batch. Reduce this value if an out-of-memory (OOM) error occurs due to large row fields, or if UDF performance causes timeouts. Avoid setting this too low, as it degrades SQL engine performance. | 1–1024 | 1024 |
odps.sql.groupby.position.alias | Treats integer constants in a GROUP BY clause as column positions in a SELECT statement. | True / False | — |
odps.sql.groupby.skewindata | Enables the anti-skew feature for GROUP BY operations. | True / False | — |
odps.sql.orderby.position.alias | Treats integer constants in an ORDER BY clause as column positions in a SELECT statement. | True / False | — |
odps.sql.groupby.orderby.position.alias | Treats integer constants in both GROUP BY and ORDER BY clauses as column positions in a SELECT statement. Setting this to True while jobs are running may affect job parsing. Verify that existing jobs still work correctly before enabling this for a project. | True / False | — |
odps.sql.python.version | Python version used when executing SQL statements. | cp27 / cp37 | — |
odps.sql.timezone | Time zone of the MaxCompute project. For more information, see Time zone configuration operations. | — | — |
odps.sql.decimal.tostring.trimzero | Removes trailing zeros after the decimal point for DECIMAL type values. | True / False | True |
odps.sql.select.output.format | Controls whether the MaxCompute client (odpscmd) displays column headers in query results. For more information, see MaxCompute client (odpscmd). | {"needHeader":false,"fieldDelim":""} (no headers) / {"needHeader":true,"fieldDelim":""} (with headers) | — |
odps.sql.type.json.enable | Enables JSON data types. For more information, see Instructions for using JSON data types in MaxCompute. | True / False | — |
odps.sql.executionengine.enable.string.to.date.full.format | Converts date strings that include hour, minute, and second parts. | True / False | False |
odps.sql.executionengine.enable.rand.time.seed | Controls the seed used by the RAND function. True uses the current system time (results may differ on rerun; cannot be used as a shuffle key). False uses the current instance ID (ensures idempotency). | True / False | False |
odps.sql.metering.value.max | Upper limit on resources consumed by a SQL statement. For more information, see Consumption control. | — | — |
odps.sql.job.max.time.hours | Maximum duration a single job can run. Note To set this parameter at the project level, you can click the application link or search for the DingTalk group ID 11782920 to join the MaxCompute developer community DingTalk group and contact the MaxCompute technical support team to configure it in the background. Note To set this parameter at the project level, you can click the application link or search for the DingTalk group ID 11782920 to join the MaxCompute developer community DingTalk group and contact the MaxCompute technical support team to configure it in the background. | 1–72 | 24 hours |
MAPJOIN and data skew
| Property | Description | Valid values | Default |
|---|---|---|---|
odps.sql.mapjoin.memory.max | Memory size for small tables loaded into memory during a MAPJOIN operation. The global memory limit for all small tables in the same task equals min(8 GB, odps.sql.mapjoin.memory.max × number of small tables). | 0–8192 MB | — |
odps.sql.skewjoin | Enables the SKEWJOIN feature to mitigate long-tail issues. | True / False | — |
odps.sql.skewinfo | Specifies the key and value on which SKEWJOIN operates. For more information, see Data skew tuning. | — | — |
Optimizer
| Property | Description | Valid values | Default |
|---|---|---|---|
odps.optimizer.enable.range.partial.repartitioning | Enables the Shuffle Remove feature for range-clustered tables. | True / False | — |
odps.optimizer.skew.join.topk.num | Number of hot key values the optimizer retrieves during an Aggregate operation. For more information, see SKEWJOIN HINT. | — | — |
odps.optimizer.stat.collect.auto | Enables the Freeride feature, which automatically collects column statistics for tables. For more information, see Optimizer. | True / False | — |
odps.optimizer.stat.collect.plan | Collection plan for gathering column statistics on specific columns. For more information, see Optimizer. | — | — |
User-defined functions and types
| Property | Description | Valid values | Default |
|---|---|---|---|
odps.sql.udf.jvm.memory | Maximum Java Virtual Machine (JVM) heap memory for a user-defined function (UDF). Increase this value if UDF workloads cause OOM errors, but also optimize UDF code to address the root cause. | 256–12288 | 1024 MB |
odps.function.timeout | Timeout period for a UDF. | 0–3600 | 600 seconds |
odps.sql.udf.getjsonobj.new | Controls whether the GET_JSON_OBJECT function returns the original string as-is. Projects created after April 20, 2021 return original strings by default. Projects created before that date use escape characters for JSON-formatted reserved characters, to avoid affecting existing jobs. For more information, see GET_JSON_OBJECT. | True / False | — |
odps.sql.udf.ppr.deterministic | Enables partition pruning for UDFs. For more information, see the "WHERE clause (where_condition)" section in SELECT syntax. | True / False | — |
odps.sql.udf.ppr.to.subquery | Ignores errors during result backfilling when partition pruning is performed. For more information, see the "WHERE clause (where_condition)" section in SELECT syntax. | True / False | — |
odps.sql.session.resources | Resources referenced by user-defined types (UDTs). Separate multiple resources with commas. For more information, see Overview. | Uploaded resources | — |
odps.sql.udt.display.tostring | Enables the java.util.Objects.toString() mechanism on the Wrap for all columns whose output is a UDT. | True / False | — |
odps.sql.session.java.imports | Java packages referenced by UDTs. Separate multiple packages with commas. For more information, see Overview. | Uploaded Java packages | — |
Complex type write buffer
| Property | Description | Valid values | Default |
|---|---|---|---|
odps.sql.executionengine.coldata.deep.buffer.size.max | Cache size for storing columns with complex data types (such as STRING, MAP, STRUCT, ARRAY, or BINARY) during table writes. For a table with three complex-type columns, MaxCompute reserves 192 MB (64 × 3) of memory by default. If the output table contains variables of complex types, or if a small table involved in a MAPJOIN operation contains data of complex types, adjusting this property may adversely affect the memory used to run jobs. A large value of this property may cause an OOM error. If a large value causes OOM errors, reduce this property based on actual variable sizes and the batch row count setting. | Unit: bytes | 67108864 |
External tables
| Property | Description | Valid values | Default |
|---|---|---|---|
odps.sql.external.net.vpc | Enables virtual private cloud (VPC) for external tables. For more information, see Hologres external tables. | True / False | — |
odps.sql.unstructured.data.oss.use.https | Enables HTTPS when accessing Object Storage Service (OSS) external tables at the underlying layer. For more information, see Create an OSS external table. | True / False | — |
odps.sql.unstructured.oss.commit.mode | Enables multipart upload to write data to OSS external tables. For more information, see Write data to OSS. | True / False | — |
odps.sql.unstructured.tablestore.put.row | Calls the PutRow operation to write data to a Tablestore external table. For more information, see PutRow. | True / False | — |
odps.sql.unstructured.external.max.dop | Maximum concurrency of workers when accessing external tables. | Positive integer | — |
odps.ext.oss.orc.native | Uses the native ORC reader to read tables. | True / False | — |
InstanceTunnel
| Property | Description | Valid values | Default |
|---|---|---|---|
console.sql.result.instancetunnel | Enables InstanceTunnel. For more information, see Usage notes. | True / False | — |
Security and access control
| Property | Description | Valid values | Default |
|---|---|---|---|
LabelSecurity | Enables LabelSecurity. For more information, see Label-based access control. | True / False | — |
CheckPermissionUsingACL | Enables ACL-based access control. For more information, see MaxCompute permissions. | True / False | — |
CheckPermissionUsingPolicy | Enables policy-based access control. For more information, see Policy-based access control. | True / False | — |
ObjectCreatorHasAccessPermission | Grants object creators access to the objects they create. | True / False | — |
ObjectCreatorHasGrantPermission | Allows object creators to grant other users access to their objects. | True / False | — |
ProjectProtection | Enables project data protection. For more information, see Project data protection. | True / False | — |
odps.output.field.formatter | Specifies the dynamic data masking rule applied to SQL query results. For more information, see Dynamic data masking. | Custom masking rule | — |
odps.isolation.session.enable | Enables dynamic data masking for a single session. For more information, see Dynamic data masking. | True / False | — |
Quota
| Property | Description | Valid values | Default |
|---|---|---|---|
odps.task.quota.preference.tag | Specifies the quota group for a job. This corresponds to the quota group in MaxCompute Management. For a project that uses subscription resources, you can use this property to specify a secondary quota group for a job. If the quota tag set when you submit a job is the same as the quota tag in a quota group's properties, the job is preferentially scheduled to this quota group. Otherwise, the job is scheduled to the quota group specified for its project. You can run the following statement to set the parameter. Note This property currently supports only SQL jobs. | tag_name is the Quota Group Tag of a quota group in MaxCompute Management. The quota group must be in the same region as the project owner. The tag_name can contain only letters, digits, and underscores (_). Note You can use the odps.task.quota.preference.tag parameter only after you set a tag for the quota group in MaxCompute Management. | |
odps.task.quota.preference.tag | Specifies the quota group for a job. This corresponds to the quota group in MaxCompute Management. For a project that uses subscription resources, you can use this property to specify a secondary quota group for a job. If the quota tag set when you submit a job is the same as the quota tag in a quota group's properties, the job is preferentially scheduled to this quota group. Otherwise, the job is scheduled to the quota group specified for its project. You can run the following statement to set the parameter. Note This property currently supports only SQL jobs. | tag_name is the Quota Group Tag of a quota group in MaxCompute Management. The quota group must be in the same region as the project owner. The tag_name can contain only letters, digits, and underscores (_). Note You can use the odps.task.quota.preference.tag parameter only after you set a tag for the quota group in MaxCompute Management. | |
odps.task.wlm.quota | Name of the quota used to run tasks in the current session. For more information, see Manage quotas in the new MaxCompute console. | — | — |
Examples
Set the input data size per Map worker to 256 MB:
SET odps.stage.mapper.split.size=256;Set the JVM heap memory for UDFs to 2048 MB:
SET odps.sql.udf.jvm.memory=2048;Reduce the buffer size for complex-type columns to 1 MB (suitable when each variable is no larger than 1,024 bytes and batch row count is 1024):
SET odps.sql.executionengine.coldata.deep.buffer.size.max=1048576;What's next
SHOW: Display properties configured with the SET statement.