MaxCompute: Updates in 2022

Last Updated: Jun 26, 2023

This topic describes the latest updates to MaxCompute features in 2022.

For more information about the major feature updates of MaxCompute, see Product Updates.

Updates in December 2022

Feature	Description	Release date	Region	References
WINDOW keyword in SQL statements	The WINDOW keyword is supported. You can use the WINDOW keyword to define a custom window once and then reuse that window.	2022-12-14	All regions	WINDOW keyword
FROM clause used together with the UPDATE statement	The UPDATE statement can be used together with the FROM clause to update data.	2022-12-14	All regions	UPDATE and DELETE
New string function and optimization of string functions and aggregate functions	The following built-in string functions are optimized: CAST, SPLIT, and RAND. The aggregate functions NUMERIC_HISTOGRAM and PERCENTILE_APPROX are optimized. The MASK_HASH function is added.	2022-12-14	All regions	Overview
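
The idea behind the WINDOW keyword can be sketched with a small runnable example. The sketch below uses SQLite through Python's `sqlite3` module as a stand-in engine, because SQLite also accepts a `WINDOW` clause; the table name `sales` and its columns are hypothetical, and the exact MaxCompute syntax may differ in details.

```python
import sqlite3

# SQLite stands in for MaxCompute here; this is an assumption for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, day INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("east", 1, 10), ("east", 2, 30), ("west", 1, 5), ("west", 2, 15)],
)

# The window `w` is defined once in the WINDOW clause and reused by two functions.
query = """
SELECT region, day,
       SUM(amount) OVER w AS running_total,
       ROW_NUMBER() OVER w AS row_num
FROM sales
WINDOW w AS (PARTITION BY region ORDER BY day)
ORDER BY region, day
"""
rows = conn.execute(query).fetchall()
conn.close()

for row in rows:
    print(row)
```

Defining the window once keeps the partitioning and ordering rules in a single place, so the two window functions cannot drift apart when the definition changes.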

Updates in November 2022

Feature	Description	Release date	Region	References
Custom administrator roles	MaxCompute provides two built-in administrator roles for projects: Admin and Super_Administrator. MaxCompute also allows you to create custom administrator roles. To create a custom administrator role, perform the following operations in the new version of the MaxCompute console: In the left-side navigation pane, click Projects. On the Projects page, find the desired project and click Manage in the Actions column. On the page that appears, click the Role Permissions tab and click Create Project-level Role. In the Create Role dialog box, select Admin from the Role Type drop-down list. You can limit a custom administrator role to specific administrator permissions. For example, a role can be allowed to manage only permissions or only IP address whitelists.	2022-11-15	All regions	MaxCompute permissions

Updates in October 2022

Feature	Description	Release date	Region	References
Upgrade of the data storage hierarchy from `project.table` to `project.schema.table` to allow MaxCompute to connect to data sources that use the three-level data storage hierarchy	A MaxCompute project is a basic organizational unit of MaxCompute and is used to isolate users and control access. A project contains objects such as tables, resources, and functions. Before MaxCompute provided the schema feature, these objects were placed directly in projects, so a project corresponded to a database or a schema in the hierarchy of traditional databases. This could cause confusion and inconvenience, especially when a project contained a large number of tables or other objects. With the schema feature, you can use schemas to classify the tables, resources, and functions in a project. If your source data is organized as `project.schema.table` and you want to migrate it to MaxCompute, you can use schemas to align the MaxCompute hierarchy with the source hierarchy without restructuring your business during the migration. This reduces the migration workload.	2022-10-13	All regions	Schema-related operations

Updates in September 2022

Feature	Description	Release date	Region	References
Creation of Hologres external tables in dual-signature authentication mode	The dual-signature authentication mode is an authentication protocol that is developed based on MaxCompute and Hologres. After MaxCompute processes your logon information and signature, it sends the authentication data to Hologres. Hologres then authenticates the same username by using a protocol that is compatible with the underlying layer of MaxCompute. As a result, you can directly access Hologres external tables without configuring additional authentication information, provided that you use the same Alibaba Cloud account to access MaxCompute and Hologres.	2022-09-24	All regions	Hologres foreign tables
Creation of a MaxCompute table that has the same schema as a table in external data sources by using `CREATE TABLE LIKE`	You can use the MaxCompute lakehouse solution to create a MaxCompute table that has the same schema as a table in external data sources such as E-MapReduce (EMR), Hadoop, and Data Lake Formation (DLF). You can use the `CREATE TABLE LIKE` statement to migrate table schemas from external data sources to MaxCompute. This helps improve data governance and access performance.	2022-09-23	All regions	Use SQL statements to manage an external project

Updates in August 2022

Feature	Description	Release date	Region	References
Automatic deletion of expired partitioned tables	If the lifecycle of partition data in a partitioned table expires, MaxCompute automatically reclaims the partition data. When the data of all partitions is reclaimed, MaxCompute can automatically delete the partitioned table. You can configure parameters to enable the automatic table deletion feature.	2022-08-27	All regions	Lifecycle management operations
New aggregate functions	The following aggregate functions are added: `BITWISE_AND_AGG`, `MIN_BY`, and `MAX_BY`. The BITWISE_AND_AGG function is used to return the bitwise AND value of all input values. The MIN_BY function is used to return the value of a column that corresponds to the row in which the minimum value of another column is located. The MAX_BY function is used to return the value of a column that corresponds to the row in which the maximum value of another column is located.	2022-08-27	All regions	Aggregate functions
Schema copy of external tables	When you create an internal table in MaxCompute, you can use the `CREATE TABLE LIKE` statement to copy the schema of an external table. This feature helps improve the table creation efficiency.	2022-08-27	All regions	Create a table
New function for materialized views	A function is added to MaxCompute. This function allows you to query the status of materialized views. You can use this function to check whether the data of the current materialized view is the same as the data of the original table. You can also use this function to check whether the data of a partition in the current materialized view is the same as the data of the mapped partition in the original table. If the data is the same, True is returned. If the data is different, False is returned.	2022-08-27	All regions	Materialized view operations
Empty partitions generated in materialized views	When you update the data of a partition in a partitioned materialized view, MaxCompute can generate an empty partition in the partitioned materialized view if the calculation result of the materialized view indicates that the partition does not contain data. This ensures that partitions are consecutively generated.	2022-08-27	All regions	Materialized view operations
Job-level quota groups	MaxCompute allows you to specify a quota group at the job level, which lets you use quota groups more flexibly. If specific jobs in a project occupy a large number of resources, the overall timeliness of jobs in the project is affected. For example, data refresh jobs occupy a large number of resources but have low timeliness requirements, whereas specific algorithm jobs occupy a large number of resources and have high timeliness requirements. In this case, you can specify different quota groups for the jobs to isolate their computing resources. This way, you do not need to create another project, migrate the jobs to it, and associate a quota group with it only to isolate resources.	2022-08-23	All regions	Use of computing resources
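
The semantics described above for `MIN_BY`, `MAX_BY`, and `BITWISE_AND_AGG` can be illustrated with a minimal Python sketch. It is not MaxCompute code: the helper names, the `products` rows, and the column names are hypothetical, and the sketch does not model how MaxCompute treats NULL values or ties.

```python
from functools import reduce
from operator import and_


def min_by(rows, value_col, order_col):
    """Return value_col from the row in which order_col is smallest."""
    return min(rows, key=lambda r: r[order_col])[value_col]


def max_by(rows, value_col, order_col):
    """Return value_col from the row in which order_col is largest."""
    return max(rows, key=lambda r: r[order_col])[value_col]


def bitwise_and_agg(values):
    """Return the bitwise AND of all input integers."""
    return reduce(and_, values)


# Hypothetical sample rows used only for illustration.
products = [
    {"name": "pen", "price": 3, "flags": 0b0111},
    {"name": "book", "price": 12, "flags": 0b0101},
    {"name": "lamp", "price": 40, "flags": 0b1101},
]

cheapest = min_by(products, "name", "price")
priciest = max_by(products, "name", "price")
common_flags = bitwise_and_agg(p["flags"] for p in products)

print(cheapest, priciest, bin(common_flags))  # pen lamp 0b101
```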

Updates in July 2022

Feature	Description	Release date	Region	References
Addition of a regular expression function	The regular expression function `REGEXP_EXTRACT_ALL` is added to MaxCompute. This function extracts, in a single call, all substrings of a string that match the specified pattern and returns the substrings as an array. This function helps improve the data processing efficiency.	2022-07-14	All regions	String functions
Custom prefixes and extensions of exported file names	When you use the UNLOAD statement to export data from MaxCompute to OSS, you can configure prefixes and extensions for the names of the exported data files.	2022-07-14	All regions	UNLOAD
Table split size settings	MaxCompute allows you to configure a split size for tables to control the job parallelism. If resources are sufficient but jobs run slowly, or if resources are insufficient and jobs wait a long time for resource allocation, you can adjust the split size to improve the computing efficiency.	2022-07-14	All regions	SELECT syntax
Addition and performance tuning of window functions	The `FIRST_VALUE`, `LAST_VALUE`, and `NTH_VALUE` window functions are added. All window functions are also tuned, which significantly increases their computing performance.	2022-07-14	All regions	Window functions
Addition of aggregate functions	The following aggregate functions are added: `BITWISE_OR_AGG`, `MAP_AGG`, `MULTIMAP_AGG`, `MAP_UNION`, `MAP_UNION_SUM`, and `HISTOGRAM`. You can use these aggregate functions to aggregate the input bits or maps to facilitate data analysis and statistics.	2022-07-14	All regions	Aggregate functions
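
As a rough illustration of what some of the functions in this table compute, the following Python sketch mimics `REGEXP_EXTRACT_ALL`, `MAP_AGG`, `MULTIMAP_AGG`, and `HISTOGRAM` on in-memory data. The helper names and sample values are hypothetical. The sketch ignores any optional arguments the real functions may accept, and the choice to keep the last value for a duplicate key in `map_agg` is an assumption of the sketch, not documented MaxCompute behavior.

```python
import re
from collections import Counter, defaultdict


def regexp_extract_all(source, pattern):
    """Return every substring of source that matches pattern, as a list."""
    return [m.group(0) for m in re.finditer(pattern, source)]


def map_agg(pairs):
    """Build one map from (key, value) pairs; the last value wins here."""
    return dict(pairs)


def multimap_agg(pairs):
    """Build a map from each key to the list of all its values."""
    result = defaultdict(list)
    for key, value in pairs:
        result[key].append(value)
    return dict(result)


def histogram(values):
    """Build a map from each input value to the number of times it occurs."""
    return dict(Counter(values))


numbers = regexp_extract_all("order 12, item 345, qty 6", r"\d+")
pairs = [("a", 1), ("b", 2), ("a", 3)]

print(numbers)                      # ['12', '345', '6']
print(map_agg(pairs))               # {'a': 3, 'b': 2}
print(multimap_agg(pairs))          # {'a': [1, 3], 'b': [2]}
print(histogram(["x", "y", "x"]))   # {'x': 2, 'y': 1}
```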

Updates in June 2022

Feature	Description	Release date	Region	References
Exclusive resource quota for MaxCompute subscription projects	An exclusive resource quota can be created for MaxCompute subscription projects. This way, compute units (CUs) of this quota cannot be occupied by jobs that use other quotas even if some CUs of the quota are idle. You can configure an exclusive resource quota for services such as business intelligence (BI) and ALGO because these services may use CUs of a quota at any time. If some CUs of a quota are idle, the idle CUs may be occupied by jobs that use another quota in which the value of Maximum Reserved CUs is greater than the value of Minimum Reserved CUs. To prevent CUs of a quota from being occupied by jobs that use other quotas for a long period of time, you can configure an exclusive resource quota.	2022-06-27	All regions	Use MaxCompute Management
Maximum number of parallel CUs for a job that uses a subscription quota	The maximum number of parallel CUs that a single job can use is configurable. If a MaxCompute job that uses a subscription quota occupies CUs for a long period of time, other jobs may keep waiting for CUs. This feature helps prevent such a job from occupying CUs for a long period of time. Before you configure the maximum number of parallel CUs for a single job, evaluate the required parallelism based on your business requirements. If you set the maximum to an excessively small value and only a few jobs are running, the jobs may run slowly even though some CUs of the quota remain idle.	2022-06-27	All regions	Use MaxCompute Management

Updates in May 2022

Feature	Description	Release date	Region	References
Separate charging for MaxCompute external tables based on the types of external tables	MaxCompute external tables are separately charged based on the types of external tables. You can view the fees that are incurred by OSS external tables and Tablestore external tables in your bills. This way, you can view the fees that are incurred by different data sources for joint computing.	2022-05-17	All regions	View billing details

Updates in March 2022

Feature	Description	Release date	Region	References
DISTRIBUTED MAPJOIN	In specific scenarios, DISTRIBUTED MAPJOIN hints can be used to improve the computing performance and reduce the time required for data computation.	2022-03-17	All regions	DISTRIBUTED MAPJOIN
Enhancement of OSS external tables	When MaxCompute writes data to an OSS external table, MaxCompute can automatically create a directory to store the data. When you create an OSS external table, you can specify the cache capacity for reading files.	2022-03-17	All regions	Create an OSS external table
New parsing method of JSON data	If a key in JSON data contains a period (.), MaxCompute allows you to enclose the key in brackets and single quotation marks (['']) to parse the JSON data.	2022-03-17	All regions	GET_JSON_OBJECT_TUPLE and JSON_TUPLE
Enhancement of the TRIM, LTRIM, and RTRIM functions	MaxCompute allows you to use the TRIM, LTRIM, and RTRIM functions to remove specified characters from the left side, the right side, or both sides of a string.	2022-03-17	All regions	String functions
Enhancement of the query rewrite feature for materialized views	The query rewrite feature can be enabled for a materialized view to help you rewrite a query statement that includes an OUTER JOIN clause, a UNION clause, or a UNION ALL clause.	2022-03-17	All regions	Materialized view operations
Skipping of headers or footers of TEXTFILE files	The `skip.header.line.count` and `skip.footer.line.count` parameters can be used to skip header rows at the beginning and footer rows at the end of a CSV file during data processing in MaxCompute. The parameters can also be used if a CSV file is compressed in the .gz, .bz2, or .lzo format.	2022-03-01	All regions	Create an OSS external table
Compatibility with Apache Spark 3.1	MaxCompute is compatible with Apache Spark 3.1 in addition to the following Apache Spark versions: 1.6, 2.3, and 2.4.	2022-03-01	All regions	Set up a Linux development environment
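
The effect of skipping header and footer rows can be sketched in a few lines of Python. This is only an illustration of the row-skipping idea on an in-memory string, not how MaxCompute reads files from OSS. The function `read_rows` and its arguments are hypothetical; the arguments are named after the `skip.header.line.count` and `skip.footer.line.count` parameters.

```python
import csv


def read_rows(text, skip_header_line_count=0, skip_footer_line_count=0):
    """Parse CSV text, dropping the given number of leading and trailing lines."""
    lines = text.splitlines()
    end = len(lines) - skip_footer_line_count
    body = lines[skip_header_line_count:end]
    return list(csv.reader(body))


# Hypothetical file content: one header row, two data rows, one footer row.
sample = "id,name\n1,alice\n2,bob\ntotal,2\n"

data = read_rows(sample, skip_header_line_count=1, skip_footer_line_count=1)
print(data)  # [['1', 'alice'], ['2', 'bob']]
```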

Updates in February 2022

Feature	Description	Release date	Region	References
Data security management of Logview	A custom parameter can be used to specify whether to display the execution results of jobs in Logview of MaxCompute. This enhances data security.	2022-02-25	All regions	Project operations
Table schema changes	The schema of a table in MaxCompute can be changed. You can add fields of complex data types to a table, delete fields from a table, and change the sequence of fields in a table.	2022-02-23	All regions	Partition and column operations

Updates in January 2022

Feature	Description	Release date	Region	References
Display of MaxCompute external project metadata on DataWorks Data Map	The metadata of MaxCompute external projects can be displayed on DataWorks Data Map.	2022-01-10	Germany (Frankfurt) and Singapore	Lakehouse of MaxCompute