This topic provides release notes for MaxCompute product features.
July 2025
MaxCompute Engine V52 was officially released in July 2025. This version adds or enhances the following features:
Data warehouse engine
New features
Support for ingestion-time partitioned tables is added. When you write data, the system automatically captures the ingestion time. You can specify a time calculation function, TRUNC_TIME, to compute the partition column value based on the ingestion time. This enables automatic table partitioning.
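As a rough sketch only: the DDL below assumes an AUTO PARTITIONED BY clause over a system-captured ingestion-time value; the `_ingestion_time` name and the clause spelling are assumptions based on the description above, so consult the product documentation for the authoritative syntax.

```sql
-- Sketch, not confirmed syntax: assumes the system exposes the
-- captured ingestion time to TRUNC_TIME for deriving the partition.
CREATE TABLE ods_events (
  event_id BIGINT,
  payload  STRING
)
AUTO PARTITIONED BY (TRUNC_TIME(_ingestion_time, 'day') AS pt);

-- No partition spec is needed on write; pt is derived automatically
-- from the time at which each row is ingested.
INSERT INTO ods_events VALUES (1, 'hello');
```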
Enhanced partition pruning for auto-partitioned tables: When a table is created with a time or date function applied to a time or date column, such as DATE, DATETIME, TIMESTAMP, or TIMESTAMP_NTZ, the computed result can now be used as a filter condition for partition pruning. For more information, see Partition pruning for auto-partitioned tables using time calculation functions.
Language and syntax
Support for GROUP BY ALL is added. To use this feature, you can run the `SET odps.sql.bigquery.compatible=true;` command to enable BigQuery compatibility mode. We recommend that you enable this feature at the project level.
The DELETE FROM and UPDATE syntax now supports table aliases. You can use the following formats:

```sql
DELETE FROM <table_name> [[as] alias] [WHERE <condition>];
UPDATE <table_name> [[as] alias] SET ...;
```
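For illustration, assuming a hypothetical sale_detail table and BigQuery-style GROUP BY ALL semantics (grouping by every non-aggregated expression in the select list):

```sql
SET odps.sql.bigquery.compatible=true;

-- Groups by shop_name and region without repeating them in GROUP BY.
SELECT shop_name, region, SUM(total_price) AS revenue
FROM sale_detail
GROUP BY ALL;

-- Table alias in DELETE, following the syntax above.
DELETE FROM sale_detail AS t WHERE t.shop_name = 'closed_shop';
```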
Built-in functions
Date and time functions
The ISOYEAR function is added. This function extracts the ISO 8601 year from a date input.
The CURRENT_DATE, CURRENT_TIMESTAMP_NTZ, and CURRENT_MICROS functions are added. These functions accept a time zone parameter to return the current time in the specified time zone.
The following time series functions are added: TIME_ADD, TIME_SUB, TIME_DIFF, TO_TIME, FORMAT_TIME, and TIME_TRUNC. These functions support constructing, converting, and computing time values that are represented as BIGINT in microseconds.
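A hypothetical composition of these functions over BIGINT microsecond values; the table and column names and the exact argument orders are assumptions, so check the function reference.

```sql
-- ts_begin and ts_end are assumed BIGINT columns holding microseconds.
SELECT
  TO_TIME('12:34:56')                   AS t_us,     -- construct a time value
  TIME_ADD(ts_begin, 30, 'minute')      AS shifted,  -- add 30 minutes
  TIME_DIFF(ts_end, ts_begin, 'second') AS dur_s,    -- elapsed seconds
  TIME_TRUNC(ts_begin, 'hour')          AS hour_bkt, -- truncate to the hour
  FORMAT_TIME('HH:mm:ss', ts_begin)     AS pretty    -- render as a string
FROM events;
```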
The DATEADD, DATEDIFF, DATETRUNC, and LAST_DAY functions now support additional date_part values.
The WEEKOFYEAR function now supports additional date_part values and accepts DATE-type input parameters.
Network functions
A set of NET network functions is added: NET_IP_FROM_STRING, NET_SAFE_IP_FROM_STRING, NET_IP_NET_MASK, NET_IPV4_TO_INT64, NET_IP_TO_STRING, NET_HOST, NET_PUBLIC_SUFFIX, and NET_REG_DOMAIN. These functions process STRING and BINARY network-related data. They support IP address format conversion, URL parsing, and network mask retrieval.
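The calls below are illustrative only; return types and exact semantics follow the function reference, and the "safe" variant returning NULL on invalid input is an assumption based on its name.

```sql
SELECT
  NET_IP_FROM_STRING('192.168.1.1')     AS ip_bin,  -- parse an IP address string
  NET_SAFE_IP_FROM_STRING('not-an-ip')  AS ip_safe, -- assumed NULL instead of an error
  NET_HOST('https://example.com/a?b=1') AS host,    -- extract the host part of a URL
  NET_REG_DOMAIN('www.example.co.uk')   AS reg_dom; -- registered domain of the host
```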
String functions
The following string functions are added: BASE32, CODEPOINT_ARRAY, SAFE_CONVERT_BYTES_TO_STRING, and FORMAT_STRING. These functions provide flexible binary data processing and string formatting capabilities.
The REGEXP_CONTAINS function is added. This function checks whether a string contains content that matches a specified string or regular expression.
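A minimal example:

```sql
SELECT REGEXP_CONTAINS('maxcompute-v52', 'v\\d+');  -- TRUE: 'v52' matches the pattern
```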
Mathematical functions
The LEAST and GREATEST functions now support DATE-type input parameters.
Other
When you run the `SET odps.sql.bigquery.compatible=true;` command to enable BigQuery compatibility mode, the behavior of some functions changes:
The SUBSTR function handles `position = 0` differently.
The REVERSE function handles STRING-type data differently.
The STDDEV_SAMP function behaves differently when the window contains only one non-NULL expr value.
The REGEXP_EXTRACT and REGEXP_EXTRACT_ALL functions behave differently when the `pattern` has no groups and no `group_id` is specified.
April 2025
MaxCompute Engine V51 was officially released in April 2025. This version adds or enhances the following features:
Data warehouse engine
New features
Support for auto-partitioned tables that use time calculation functions (AUTO PARTITION) is added. You can apply specific time calculation functions to time or date columns, such as DATE, DATETIME, TIMESTAMP, and TIMESTAMP_NTZ, in a table. The system automatically computes the partition column value based on the result and partitions the table. For more information, see Auto-partitioned tables using time calculation functions.
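A sketch of the DDL, using the TRUNC_TIME function (described under Built-in functions below) on a DATETIME column; the table and column names are illustrative, and the clause spelling should be verified against the linked topic.

```sql
-- Each row's pt value is computed from order_time at write time,
-- and partitions are created automatically.
CREATE TABLE auto_pt_sales (
  order_id   BIGINT,
  order_time DATETIME
)
AUTO PARTITIONED BY (TRUNC_TIME(order_time, 'day') AS pt);
```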
Feature enhancements
When you rebuild BloomFilter indexes with the ALTER TABLE REBUILD INDEX command, you can rebuild multiple partitions at once. For more information, see Generate a BloomFilter index.
When you create materialized views, MAX_PT is now supported in addition to GETDATE. During query rewriting, the system uses the MAX_PT value at the time of materialized view creation. This improves the coverage of query rewriting. For more information, see Create materialized views (with partitioning and clustering).
Language and syntax
The STRUCT syntax is enhanced to support `STRUCT(*)`. This syntax lets you reference or operate on all fields of a STRUCT in a query. For more information, see STRUCT data type.
When you run the `SHOW HISTORY FOR TASK` command to display the task history, you can specify the start position and length to control the output more precisely. For more information, see View the history list of automatically scheduled periodic task instances.
Built-in functions
The TRUNC_TIME function is added. This function truncates date or time data by the time unit that is specified in datepart and returns a STRING. For more information, see TRUNC_TIME.
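For example (the exact format of the returned STRING follows the linked reference):

```sql
SELECT TRUNC_TIME(DATETIME '2025-04-15 13:45:30', 'day');  -- e.g. '2025-04-15'
```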
The BOOL_AND and BOOL_OR functions are added. BOOL_AND performs a logical AND operation on a set of Boolean values. BOOL_OR performs a logical OR operation. Both functions return the logical result of all non-NULL input values. For more information, see BOOL_AND and BOOL_OR.
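For example, over a hypothetical checks table with a BOOLEAN column passed:

```sql
SELECT
  BOOL_AND(passed) AS all_passed,  -- TRUE only if every non-NULL value is TRUE
  BOOL_OR(passed)  AS any_passed   -- TRUE if at least one non-NULL value is TRUE
FROM checks;
```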
The BIT_COUNT function is added. This function counts the number of 1s in the binary representation of an input value. For more information, see BIT_COUNT.
The BITWISE_XOR_AGG function is added. This function computes the bitwise XOR of all input values. For more information, see BITWISE_XOR_AGG.
The ANY function is added. This function checks whether at least one element in the input is TRUE. For more information, see ANY.
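The following examples illustrate the three functions above; the inline VALUES table is used for brevity.

```sql
SELECT BIT_COUNT(7);  -- 3: binary 111 contains three 1s

-- Aggregates over an inline table t with a single column v.
SELECT
  BITWISE_XOR_AGG(v),  -- 1 XOR 3 XOR 5 = 7
  ANY(v > 4)           -- TRUE: at least one value satisfies the predicate
FROM VALUES (1), (3), (5) t(v);
```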
Performance and parameter upgrades
The Shuffle Removal feature is further enhanced. Unnecessary shuffles are eliminated when you write to clustered tables, which improves job performance.
February 2025
MaxCompute SQL V50 was officially released in February 2025. This version adds or enhances the following features:
Data warehouse engine
Data formats
The DECIMAL data type supports a higher scale. The scale value of `DECIMAL(precision,scale)` indicates the number of digits in the fractional part of a DECIMAL value. You can run the `SET odps.sql.decimal2.extended.scale.enable=true;` command to extend the range of the scale from [0, 18] to [0, 38]. For more information, see MaxCompute data type system version 2.0.
Syntax and function enhancements
Enhanced STRUCT syntax: The STRUCT syntax is enhanced to support the STRUCT expression syntax. This syntax lets you use named expressions to construct data of the STRUCT type. This provides a new method to construct this complex data type. For more information, see STRUCT data type.
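A sketch of the named-expression form; the exact placement of the AS keyword is an assumption based on the description above.

```sql
-- Constructs a STRUCT<id:BIGINT, name:STRING> value with named fields.
SELECT STRUCT(1 AS id, 'alice' AS name) AS user_info;
```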
Enhanced FIND_IN_SET function: This function can be used to search for a substring within a string that consists of substrings separated by a delimiter and return the position of the substring. In earlier MaxCompute SQL versions, only a comma (,) can be used as the delimiter for this function. After the enhancement, you can specify a custom delimiter of the STRING type for this function. This enables more flexible string search. For more information, see FIND_IN_SET.
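For example (the custom delimiter is shown as a third argument, following the classic FIND_IN_SET(str, strList) signature; confirm the parameter position against the linked reference):

```sql
SELECT FIND_IN_SET('ab', 'abc,b,ab,c,def');       -- 3: default comma delimiter
SELECT FIND_IN_SET('ab', 'abc_b_ab_c_def', '_');  -- 3: custom '_' delimiter
```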
New built-in function GET_DATA_FROM_OSS: The `GET_DATA_FROM_OSS` built-in function is added. This function can be used to download Object Storage Service (OSS) object data from Object Tables and return binary data for subsequent computing. For more information, see Query object content for business computing.
Feature enhancements
MATERIALIZED CTE: When you define a common table expression (CTE), you can use a `MATERIALIZE` hint in the `SELECT` statement to cache the CTE calculation result to a temporary table. When you subsequently access the CTE, you can directly read the result from the cache. This helps avoid out-of-memory (OOM) issues in multi-layer CTE nesting scenarios and improves the performance of the CTE statement. For more information, see MATERIALIZE CTE.
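A sketch of the hint, assuming it is written inside the CTE's SELECT statement; the table and column names are illustrative.

```sql
WITH daily_agg AS (
  SELECT /*+ MATERIALIZE */ ds, COUNT(*) AS cnt
  FROM events
  GROUP BY ds
)
-- The first reference computes and caches daily_agg; the second
-- reads the cached result instead of recomputing it.
SELECT * FROM daily_agg WHERE cnt > 100
UNION ALL
SELECT * FROM daily_agg WHERE cnt <= 10;
```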
Enhanced observability of Bloom filter indexes: The time taken to merge Bloom filter indexes is displayed on the SubStatusHistory tab of LogView. For more information, see Generate a Bloom filter index.
Enhanced materialized views: The query rewrite feature of materialized views is improved to support more operators, including `DISTRIBUTED BY`, `ORDER BY`, `ORDER BY` + `LIMIT`, and `LIMIT`. For more information, see Materialized view query rewrite.
Performance and parameter upgrades
For the ARRAY_CONTAINS function, the optimizer automatically detects when an input parameter is processed by the SPLIT function in string lookup scenarios. By default, the optimizer converts this expression to an equivalent FIND_IN_SET operation and supports additional separator types. For example, ARRAY_CONTAINS(SPLIT(c1, '_'), c2) is automatically optimized to FIND_IN_SET(c1, c2, '_'), which improves the execution performance of ARRAY_CONTAINS.
By default, the Reshuffle Split capability is enabled for dynamic partitioning to optimize scenarios involving dynamic partition reshuffling. This optimization splits the data flow of dynamic partitions and performs reshuffling only on a single path, thereby reducing the overhead of dynamic partition reshuffling while avoiding the issue of excessive small files.
The Shuffle Removal capability is further enhanced to eliminate unnecessary shuffles in MAPJOIN and PARTITIONED HASH JOIN scenarios and improve job performance.
Near real-time data warehousing
Delta Table CDC feature release
Delta Table Change Data Capture (CDC) identifies and captures changes to database table data. It records row-level insert, update, and delete operations on Delta Table incremental tables. This effectively captures data change events for the table. Delta Table CDC automatically generates timed scheduling tasks when created. These tasks execute asynchronous jobs to generate CDC data. The syntax includes `CREATE`, `ALTER`, and `DROP`. You can use CDC events to drive incremental computing, data synchronization, and data warehouse layering. For more information, see CDC (invitational preview).
New Stream data stream object feature
A Stream is a data stream object that is managed by MaxCompute. It automatically manages the version of incremental query data for Delta Tables. A Stream records DML changes, such as inserts, updates, and deletes, to incremental tables. Each change includes metadata to help you act on the changed data. For more information, see Stream objects.
Periodic scheduling task feature release
MaxCompute periodic scheduling tasks (Periodic Tasks) let you flexibly customize scheduling policies. They execute SQL logic to automatically build data pipelines and simplify ETL workflows. You can combine tasks with table data changes and stream reads to create continuous ETL workflows that process recently changed data. For more information, see Periodic scheduling tasks (invitational preview).
Data writing from Flink to Delta tables
MaxCompute Delta tables support multiple data write methods. MaxCompute provides a new version of the Flink connector plug-in. The Flink connector plug-in can be used to write data from Flink to MaxCompute standard tables and Delta tables. This facilitates data writing from Flink to MaxCompute. In addition, Flink CDC data can be directly written to Delta tables. For more information, see Upload data to MaxCompute with Flink.
Lakehouse and the external table feature
When you create an external table to parse data in the PARQUET format, implicit conversions are supported for some data types, such as TINYINT, SMALLINT, and DATETIME.
The MAX_PT function can be used to query the latest partition of an external table. You can use this function to query the latest partition that contains data in an OSS external table. For more information, see MAX_PT.
November 2024
MaxCompute SQL V49 was officially released in November 2024. This version adds or enhances the following features:
Data warehouse engine
New features
Bitmap index: You can create bitmap indexes for low-cardinality columns that contain many duplicate values. In range filtering scenarios, bitmap indexes can filter out more than 50% of the data, which helps accelerate queries. For more information, see Bitmap index (beta).
Bloom filter index: A Bloom filter is an efficient probabilistic data structure. MaxCompute lets you use Bloom filter indexes to perform large-scale point queries. This reduces unnecessary data scanning during queries and improves the overall query efficiency and performance. For more information, see Bloom filter index (Beta).
Built-in functions
The built-in function JSON_EXPLODE is added. This function is used to expand each element in a JSON array or JSON object into multiple rows. For more information, see JSON_EXPLODE.
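Assuming JSON_EXPLODE is a table-generating function invoked through LATERAL VIEW like EXPLODE; the table name and the output column names below are illustrative only, so check the linked reference for the actual output schema.

```sql
-- Expands each element of src.json_col (a JSON array or object)
-- into one row per element.
SELECT t.k, t.v
FROM src
LATERAL VIEW JSON_EXPLODE(src.json_col) t AS k, v;
```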
Language enhancements
You can use the OR REPLACE clause in the CREATE TABLE statement to update the metadata of a table. If the destination table already exists, you can directly update the metadata of the table. You do not need to delete an existing table before you re-create the table. This simplifies the use of SQL statements and improves ease of use. For more information, see Create and drop tables.
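For example (the table name and columns are illustrative):

```sql
-- Creates the table, or replaces its metadata if it already exists,
-- without a preceding DROP TABLE.
CREATE OR REPLACE TABLE sale_summary (
  region  STRING,
  revenue DECIMAL(18, 2)
);
```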
Single-line and multi-line comments can be used in MaxCompute SQL scripts to improve code readability. For more information, see SQL comments.
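For example, assuming the conventional SQL comment markers:

```sql
-- This is a single-line comment.
SELECT 1;  -- Trailing comments are also supported.
/* This is a
   multi-line comment. */
SELECT 2;
```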
SUBQUERY_MAPJOIN HINT is supported. Specific subqueries such as SCALAR, IN, and EXISTS are transformed into JOIN operations during the execution. You can explicitly specify hints to use the MAPJOIN algorithm to improve the execution efficiency. For more information, see SUBQUERY_MAPJOIN HINT.
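A sketch of the hint on an IN subquery; the hint name comes from the feature above, while its exact placement is an assumption to verify against the linked topic, and the table names are illustrative.

```sql
SELECT *
FROM orders o
WHERE o.cust_id IN (
  -- Asks the optimizer to execute the transformed JOIN
  -- with the MAPJOIN algorithm.
  SELECT /*+ SUBQUERY_MAPJOIN */ cust_id
  FROM vip_customers
);
```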
New parameters
The original parameter `odps.stage.mapper.split.size` applies a global split-size setting to the map stage. New flags let you flexibly split tables by row count or by concurrency. These flags can improve task concurrency in scenarios where each row in the table is small and the workloads of subsequent computing operations are heavy. For more information, see Flag parameters.
If a query repeatedly accesses the same partitioned table, you can run the `SET odps.optimizer.merge.partitioned.table=true;` command to enable the system to merge the access operations on the partitioned table. This minimizes the I/O operations on the partitioned table and improves query performance. For more information, see Flag parameters.
Behavior changes
The dynamic-to-static partition optimization feature improves query performance for all DML operations and is enabled by default. This optimization affects the behavior of UPDATE, DELETE, and MERGE INTO operations. For more information, see Announcement on the Dynamic-to-Static Partition Optimization.
Big data AI (MaxFrame)
LogView 2.0 is compatible with MaxFrame and supports the following MaxFrame-related features. For more information, see Use LogView 2.0 to view MaxFrame jobs.
Allows you to view the execution records and running durations of all directed acyclic graphs (DAGs) submitted in MaxFrame sessions.
Allows you to interactively view the execution sequence, running time, operator topology, and status relationships of sub-DAGs in each DAG.
Allows you to view the settings, status, memory usage, and CPU utilization of each child instance.
MaxFrame offers an automatic packaging service to simplify the management of third-party packages in Python-based job development. This service lets you declare required external dependency files during job development. When the job is running, the dependency files are automatically packaged and integrated into the job development environment. You do not need to manually upload the packages. This simplifies package management. For more information, see Automated packaging service.
Lakehouse and the external table feature
ZSTD compression is supported during data writes to a Parquet external table based on the JNI interface.
Before this feature was introduced, only uncompressed files and Snappy-compressed files could be written when you created a Parquet external table. Now, files compressed with the ZSTD algorithm can also be written in this scenario. This improves the compression ratio and read and write performance, and achieves greater cost-effectiveness. For more information, see ORC Foreign Tables.
CsvStorageHandler and OpenCsvSerde are provided to support more data types for data reads and writes.
MaxCompute provides OpenCsvSerde (referred to as CsvSerde), a Hive-compatible read and write standard whose supported data types are Hive-compatible data types. MaxCompute also provides CsvStorageHandler (referred to as CsvHandler), a custom read and write standard whose supported data types are those of the MaxCompute V2.0 data type edition. The data types of CsvSerde and CsvHandler are not identical, but they share several basic data types, such as INT and FLOAT. However, the parsing behavior for these shared types still differs, and a unified standard has not yet been formulated. For example, for the FLOAT type, CsvSerde defines how special values such as INF are processed, whereas CsvHandler does not handle special values and only attempts to parse with the parseFloat method. As a result, parsing behavior for basic data types may differ when you use both CsvHandler and CsvSerde.
CsvStorageHandler supports multiple basic data types, such as BOOLEAN, TIMESTAMP, DATE, and DATETIME. This lets you export data of all data types from MaxCompute to OSS and store the data in the CSV format. After cross-region replication is performed based on OSS, data can be restored to MaxCompute.
OSS external tables allow you to perform STS authentication by assuming a RAM role.
This feature is optimized so that, in scenarios that involve external tables, you can access MaxCompute by assuming a RAM role (no AccessKey pair required) or access MaxCompute from other cloud services by assuming a RAM role. Before this optimization, external tables carried a RAM role that was integrated with table properties for MaxCompute to access peer services, so the system could not obtain the user information of the assumed RAM role, and you could not access the external tables by assuming it. After this optimization, the use of RAM roles and seamless RAM-role-based access work even when external tables are involved. For more information, see ORC Foreign Tables.
During queries, the optimizer can use the statistics in temporary statistics tables to identify small tables and optimize query plans.
Because the data queried through external tables is stored in an external data lake, the system does not establish metadata locally, which preserves data openness. In this case, if you do not collect statistics in advance, the optimizer falls back to conservative policies, which reduces query efficiency. The optimizer can now use the statistics in temporary statistics tables during queries to identify small tables. This lets you proactively optimize query plans in various ways, such as performing hash join operations, optimizing the join order, reducing shuffle operations, and shortening the executed pipelines. For more information, see Paimon external table.