All Products
Search
Document Center

MaxCompute:SET

Last Updated:Aug 28, 2023

You can execute the SET statement to set MaxCompute system variables for the current session.

Syntax

set <KEY>=<VALUE>

Parameters

  • KEY: the name of the property.

  • VALUE: the value of the property.

    The following table describes common properties at the session level.

    Property

    Description

    Valid value

    console.sql.result.instancetunnel

    Specifies whether to enable InstanceTunnel. For more information about Tunnel commands, see Usage notes.

    • True: InstanceTunnel is enabled.

    • False: InstanceTunnel is disabled.

    odps.stage.mapper.mem

    Specifies the memory size of each Map worker.

    Valid values: 256 to 12288. Default value: 1024. Unit: MB.

    odps.stage.reducer.mem

    Specifies the memory size of each Reduce worker.

    Valid values: 256 to 12288. Default value: 1024. Unit: MB.

    odps.stage.joiner.mem

    Specifies the memory size of each Join worker.

    Valid values: 256 to 12288. Default value: 1024. Unit: MB.

    odps.stage.mem

    Specifies the total memory size of all workers in a specified MaxCompute job. This property has a lower priority than the odps.stage.mapper.mem, odps.stage.reducer.mem, and odps.stage.joiner.mem properties.

    Valid values: 256 to 12288. Unit: MB No default value.

    odps.stage.mapper.split.size

    Specifies the input data amount of each Map worker, which is the split size of the input file. You can use this property to indirectly control the number of workers at each Map stage.

    Note

    If the LIMIT keyword is used in SQL statements, the number of concurrent workers is limited to 1. This way, only a single worker is allowed to run at a time. In this case, when you configure this parameter, we recommend that you do not add the LIMIT keyword to SQL statements.

    Default value: 256. Unit: MB.

    odps.stage.reducer.num

    Specifies the number of workers at each Reduce stage.

    Note

    If the LIMIT keyword is used in SQL statements, the number of concurrent workers is limited to 1. This way, only a single worker is allowed to run at a time. In this case, when you configure this parameter, we recommend that you do not add the LIMIT keyword to SQL statements.

    No default value.

    odps.stage.joiner.num

    Specifies the number of workers at each Join stage.

    Note

    If the LIMIT keyword is used in SQL statements, the number of concurrent workers is limited to 1. This way, only a single worker is allowed to run at a time. In this case, when you configure this parameter, we recommend that you do not add the LIMIT keyword to SQL statements.

    No default value.

    odps.stage.num

    Specifies the total number of concurrent workers in a specified MaxCompute job. This property has a lower priority than the odps.stage.mapper.split.size, odps.stage.reducer.mem, and odps.stage.joiner.num properties.

    Note

    If the LIMIT keyword is used in SQL statements, the number of concurrent workers is limited to 1. This way, only a single worker is allowed to run at a time. In this case, when you configure this parameter, we recommend that you do not add the LIMIT keyword to SQL statements.

    No default value.

    odps.sql.reshuffle.dynamicpt

    Specifies whether to enable dynamic partitioning. Dynamic partitioning prevents the generation of a large number of small files.

    • True: Dynamic partitioning is enabled.

    • False: Dynamic partitioning is disabled.

    Note

    If a small number of dynamic partitions are generated, we recommend that you set this parameter to False to prevent data skew.

    odps.sql.type.system.odps2

    Specifies whether to enable the MaxCompute V2.0 data type edition. For more information about the MaxCompute V2.0 data type edition, see MaxCompute V2.0 data type edition.

    • True: The MaxCompute V2.0 data type edition is enabled.

    • False: The MaxCompute V2.0 data type edition is disabled.

    odps.sql.hive.compatible

    Specifies whether to enable the Hive-compatible data type edition. MaxCompute supports Hive syntax, such as inputRecordReader, outputRecordReader, and Serde, only after the Hive-compatible data type edition is enabled. For more information about the Hive-compatible data type edition, see Hive-compatible data type edition.

    • True: The Hive-compatible data type edition is enabled.

    • False: The Hive-compatible data type edition is disabled.

    odps.sql.executionengine.coldata.deep.buffer.size.max

    Specifies the cache size requested to store a column that contains data of complex types when MaxCompute writes data to a table. This improves the write performance.

    You can set this property if the output table contains excessive complex data types or the size of a variable of a specified complex data type in the output table is too large.

    • If the schemas of three columns in the output table are of complex data types, such as STRING, MAP, STRUCT, ARRAY, or BINARY, MaxCompute automatically reserves 192 MB (64 × 3) of memory for table write operations. The cache requested for each column is used to store data in the row that is specified by batch row count.

    • If the estimated size of the variables of complex data types in the table is small, we recommend that you set this parameter to a small value. For example, if the size of each variable of a specific complex data type does not exceed 1,024 bytes and the default value (1024) of batch row count is used, you can set the value of this property to 1048576 (1024 × 1024). Sample statement:

      set odps.sql.executionengine.coldata.deep.buffer.size.max=1048576;
    • If the estimated size of each variable of a complex data type is between 7 MB and 8 MB, and batch row count is set to 32, you can adjust the value of this property to 256 MB (8 × 32).

    • If the output table contains variables of complex data types or the small table involved in the MAPJOIN operation contains data of complex types, the memory used to run jobs may be affected when you adjust the value of this property. Based on the preceding calculation method, a large value of this property may cause an out of memory (OOM) error.

    Unit: bytes. Default value: 67108864.

    odps.sql.udf.getjsonobj.new

    Specifies whether the GET_JSON_OBJECT function retains the original string when it returns a value.

    From April 20, 2021, the GET_JSON_OBJECT function automatically retains the original strings for the returned results in new MaxCompute projects. For MaxCompute projects created before April 20, 2021, the GET_JSON_OBJECT function automatically returns JSON-formatted reserved characters by using escape characters. This prevents the impact on existing jobs.

    For more information about the GET_JSON_OBJECT function, see GET_JSON_OBJECT.

    • True: The original string is retained.

    • False: The original string is not retained.

    odps.sql.udf.jvm.memory

    Specifies the maximum memory size of the Java Virtual Machine (JVM) heap for a user-defined function (UDF).

    If a large amount of data is calculated and sorted in the memory by using a UDF, the OOM error may occur. In this case, you can increase the value of this property. However, you must optimize UDF code based on your business requirements to prevent OOM errors.

    Valid values: 256 to 12288. Default value: 1024. Unit: MB.

    odps.sql.udf.timeout

    Specifies the timeout period of UDFs.

    Valid values: 0 to 3600. Default value: 600. Unit: seconds.

    LabelSecurity

    Specifies whether to enable LabelSecurity. For more information about LabelSecurity, see Label-based access control.

    • True: LabelSecurity is enabled.

    • False: LabelSecurity is disabled.

    odps.sql.session.resources

    Specifies resources that are referenced by user-defined types (UDTs). You can specify multiple resources and separate them with commas (,). For more information about UDTs, see Overview.

    The uploaded resources.

    odps.sql.udt.display.tostring

    Specifies whether to enable the java.util.Objects.toString(...) mechanism on the Wrap for all the columns whose outputs are UDTs.

    • True: The java.util.Objects.toString(...) mechanism is enabled.

    • False: The java.util.Objects.toString(...) mechanism is disabled.

    odps.sql.session.java.imports

    Specifies Java packages that are referenced by UDTs. You can specify multiple Java packages and separate them with commas (,). For more information about how to reference Java packages, see Overview.

    The uploaded Java package.

    CheckPermissionUsingACL

    Specifies whether to enable ACL-based access control. For more information about ACL-based access control, see MaxCompute permissions.

    • True: ACL-based access control is enabled.

    • False: ACL-based access control is disabled.

    CheckPermissionUsingPolicy

    Specifies whether to enable policy-based access control. For more information about policy-based access control, see Policy-based access control.

    • True: Policy-based access control is enabled.

    • False: Policy-based access control is disabled.

    ObjectCreatorHasAccessPermission

    Specifies whether creators of objects can perform operations on the objects created by themselves.

    • True: By default, creators of objects can perform operations on the objects that are created by themselves.

    • False: Creators of objects cannot perform operations on the objects created by themselves.

    ObjectCreatorHasGrantPermission

    Specifies whether creators of objects can authorize other users to perform operations on the objects.

    • True: By default, creators of objects can authorize other users to perform operations on the objects.

    • False: Creators of objects cannot authorize other users to perform operations on the objects.

    ProjectProtection

    Specifies whether to enable project data protection. For more information about project data protection, see Project data protection.

    • True: Project data protection is enabled.

    • False: Project data protection is disabled.

    odps.sql.skewjoin

    Specifies whether to enable the SKEWJOIN feature to prevent long tails.

    • True: The SKEWJOIN feature is enabled.

    • False: The SKEWJOIN feature is disabled.

    odps.sql.skewinfo

    Specifies the key and value on which SKEWJOIN is executed. For more information about SKEWJOIN, see Data skew tuning.

    None.

    odps.sql.reducer.instances

    Specifies the number of hash buckets. For more information about hash buckets, see Table operations.

    0-4000

    odps.sql.udf.ppr.deterministic

    Specifies whether to enable partition pruning for UDFs. For more information about partition pruning, see WHERE clause (where_condition).

    • True: Partition pruning is enabled.

    • False: Partition pruning is disabled.

    odps.sql.udf.ppr.to.subquery

    Specifies whether to ignore the error that occurs during result backfilling when partition pruning is being performed. For more information about partition pruning, see WHERE clause (where_condition).

    • True: The error is ignored.

    • False: The error is not ignored.

    odps.optimizer.enable.range.partial.repartitioning

    Specifies whether to enable the Shuffle Remove feature for range-clustered tables.

    • True: The Shuffle Remove feature is enabled.

    • False: The shuffle Remove feature is disabled.

    odps.optimizer.skew.join.topk.num

    Specifies the number of hot key values obtained by the optimizer when it performs an Aggregate operation. For more information, see SKEWJOIN HINT.

    None.

    odps.optimizer.stat.collect.auto

    Specifies whether to enable the Freeride feature. After the Freeride feature is enabled, the column stats metrics of tables are automatically collected. For more information, see Collect information for the optimizer of MaxCompute.

    • True: The Freeride feature is enabled.

    • False: The Freeride feature is disabled.

    odps.optimizer.stat.collect.plan

    Specifies a collection plan to collect specific column stats metrics of specific columns. For more information, see Collect information for the optimizer of MaxCompute.

    None.

    odps.sql.executionengine.batch.rowcount

    Specifies the number of data rows that can be processed by the SQL engine at a time. Default value: 1024. If an OOM error occurs because of a large field in a row or if a timeout occurs due to poor UDF performance, you must set this parameter to a smaller value. We recommend that you do not set this parameter to an excessively small value because an excessively small value may affect the performance of the SQL engine.

    1-1024

    odps.sql.external.net.vpc

    Specifies whether to enable virtual private cloud (VPC) for external tables. For more information, see Hologres foreign tables.

    • True: VPC is enabled for external tables.

    • False: VPC is disabled for external tables.

    odps.sql.groupby.position.alias

    Specifies whether to process integer constants in the GROUP BY clause as column numbers in a SELECT statement.

    • True: Integer constants are processed as column numbers.

    • False: Integer constants are not processed as column numbers.

    odps.sql.groupby.skewindata

    Specifies whether to enable the anti-skew feature for the GROUP BY clause.

    • True: The anti-skew feature for the GROUP BY clause is enabled.

    • False: The anti-skew feature for the GROUP BY clause is disabled.

    odps.sql.orderby.position.alias

    Specifies whether to process integer constants in the ORDER BY clause as column numbers in a SELECT statement.

    • True: Integer constants are processed as column numbers.

    • False: Integer constants are not processed as column numbers.

    odps.sql.groupby.orderby.position.alias

    Specifies whether to process integer constants in the GROUP BY and ORDER BY clauses as column numbers in a SELECT statement. We recommend that you specify this parameter.

    Note

    If you set this parameter to True for projects in which jobs are running, the job parsing and running may be affected. Therefore, you must make sure that existing jobs can still run properly based on the original logic when you set the parameter to True. Otherwise, specify this parameter for sessions.

    • True: Integer constants in the GROUP BY and ORDER BY clauses are processed as column numbers in SELECT statements.

    • False: Integer constants in the GROUP BY and ORDER BY clauses are not processed as column numbers in SELECT statements.

    odps.sql.mapjoin.memory.max

    Specifies the size of small tables that are read to the memory when the MAPJOIN statement is executed. Unit: MB.

    0 MB-8192 MB

    odps.sql.metering.value.max

    Specifies the upper limit on resources consumed by an SQL statement. For more information, see Consumption control.

    None.

    odps.sql.python.version

    Specifies the Python version on which SQL statements are executed.

    • cp27

    • cp37

    odps.sql.select.output.format

    Specifies whether to display the table headers in the returned results of the MaxCompute client. For more information, see MaxCompute client (odpscmd).

    • {""needHeader"":false,""fieldDelim"":""""}: The table headers are not displayed.

    • {{""needHeader"":true,""fieldDelim"":""""}: The table headers are displayed.

    odps.sql.timezone

    Specifies the time zone of the MaxCompute project that you accessed. For more information about time zones, see Time zone configuration operations.

    None.

    odps.sql.unstructured.data.oss.use.https

    Specifies whether to enable the mechanism of obtaining data by using HTTPS at the underlying layer when you access external tables. For more information, see Create an OSS external table.

    • True: The mechanism is enabled.

    • False: The mechanism is disabled.

    odps.sql.unstructured.oss.commit.mode

    Specifies whether to enable the multipart upload feature of Object Storage Service (OSS) to write data to OSS external tables. For more information, see Write data to OSS.

    • True: The multipart upload feature is enabled.

    • False: The multipart upload feature is disabled.

Examples

-- Set the size of data read by each Mapper to 256 MB. 
set odps.stage.mapper.split.size=256;

Related statements

SHOW: This statement is used to display the properties that you configured by using the SET statement.