MaxCompute: Updates in 2022

Last Updated: Jun 26, 2023

This topic describes the latest updates to MaxCompute features in 2022.

For more information about the major feature updates of MaxCompute, see Product Updates.

Updates in December 2022

| Feature | Description | Release date | Region | References |
| --- | --- | --- | --- | --- |
| WINDOW keyword in SQL statements | The WINDOW keyword is supported. You can use it to define a named window once and reuse that window in multiple window functions of the same statement. | 2022-12-14 | All regions | WINDOW keyword |
| FROM clause used together with the UPDATE statement | The UPDATE statement can be used together with a FROM clause to update data. | 2022-12-14 | All regions | UPDATE and DELETE |
| New string function and optimization of string functions and aggregate functions | The following built-in string functions are optimized: CAST, SPLIT, and RAND. The aggregate functions NUMERIC_HISTOGRAM and PERCENTILE_APPROX are optimized. The MASK_HASH function is added. | 2022-12-14 | All regions | Overview |
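
The WINDOW keyword entry above describes defining a window once and reusing it. The following is a minimal sketch of that idea, not MaxCompute code: it runs against SQLite (version 3.25 or later, through Python's built-in sqlite3 module) as a stand-in because SQLite also accepts a WINDOW clause. The table name, column names, and data are hypothetical, and the exact MaxCompute syntax may differ, so check the WINDOW keyword reference before relying on it.

```python
import sqlite3

# SQLite stands in for MaxCompute here (assumption); the `sales` table and
# its data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, day INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("east", 1, 10), ("east", 2, 30), ("west", 1, 5), ("west", 2, 15)],
)

# The window `w` is defined once in the WINDOW clause and reused by two
# window functions instead of repeating the PARTITION BY / ORDER BY text.
query = """
SELECT region, day,
       SUM(amount) OVER w   AS running_total,
       ROW_NUMBER() OVER w  AS row_in_region
FROM sales
WINDOW w AS (PARTITION BY region ORDER BY day)
ORDER BY region, day
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Defining the window once keeps the partitioning and ordering rules in a single place, so a later change to the window applies to every function that references it.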

Updates in November 2022

| Feature | Description | Release date | Region | References |
| --- | --- | --- | --- | --- |
| Custom administrator roles | MaxCompute provides the following built-in administrator roles for projects: Admin and Super_Administrator. MaxCompute also allows you to create custom administrator roles. To create one in the new version of the MaxCompute console: (1) In the left-side navigation pane, click Projects. (2) On the Projects page, find the desired project and click Manage in the Actions column. (3) On the page that appears, click the Role Permissions tab and click Create Project-level Role. (4) In the Create Role dialog box, select Admin from the Role Type drop-down list. You can grant a custom administrator role only specific administrator permissions. For example, you can create a role that only manages permissions or only manages IP address whitelists. | 2022-11-15 | All regions | MaxCompute permissions |

Updates in October 2022

| Feature | Description | Release date | Region | References |
| --- | --- | --- | --- | --- |
| Upgrade of the data storage hierarchy from project.table to project.schema.table to allow MaxCompute to connect to data sources that use the three-level data storage hierarchy | A MaxCompute project is the basic organizational unit of MaxCompute and is used to isolate users and control access. A project contains objects such as tables, resources, and functions. Before the schema feature was available, these objects were placed directly in projects, so a project corresponded to a database or schema in the hierarchy of traditional databases. This could be confusing and inconvenient, especially when a project contained a large number of tables or other objects. With the schema feature, you can use schemas to classify tables, resources, and functions in a project. If your source data uses the project.schema.table hierarchy, you can align the MaxCompute hierarchy with the data source hierarchy during migration without reconstructing your business, which reduces the migration workload. | 2022-10-13 | All regions | Schema-related operations |

Updates in September 2022

| Feature | Description | Release date | Region | References |
| --- | --- | --- | --- | --- |
| Creation of Hologres external tables in dual-signature authentication mode | The dual-signature authentication mode is an authentication protocol that is developed based on MaxCompute and Hologres. MaxCompute uses your logon information and signature and then sends authentication data to Hologres. Hologres then authenticates the same username based on a protocol that is compatible with the underlying layer of MaxCompute. As long as you use the same Alibaba Cloud account to access MaxCompute and Hologres, you can directly access Hologres external tables without configuring additional authentication information. | 2022-09-24 | All regions | Hologres foreign tables |
| Creation of a MaxCompute table that has the same schema as a table in external data sources by using CREATE TABLE LIKE | You can use the MaxCompute lakehouse solution to create a MaxCompute table that has the same schema as a table in an external data source such as E-MapReduce (EMR), Hadoop, or Data Lake Formation (DLF). You can use the CREATE TABLE LIKE statement to migrate table schemas from external data sources to MaxCompute. This helps improve data governance and access performance. | 2022-09-23 | All regions | Use SQL statements to manage an external project |

Updates in August 2022

| Feature | Description | Release date | Region | References |
| --- | --- | --- | --- | --- |
| Automatic deletion of expired partitioned tables | If the lifecycle of partition data in a partitioned table expires, MaxCompute automatically reclaims the partition data. After the data of all partitions is reclaimed, MaxCompute can automatically delete the partitioned table. You can configure parameters to enable the automatic table deletion feature. | 2022-08-27 | All regions | Lifecycle management operations |
| New aggregate functions | The following aggregate functions are added: BITWISE_AND_AGG, MIN_BY, and MAX_BY. BITWISE_AND_AGG returns the bitwise AND of all input values. MIN_BY returns the value of one column from the row in which another column has its minimum value. MAX_BY returns the value of one column from the row in which another column has its maximum value. | 2022-08-27 | All regions | Aggregate functions |
| Schema copy of external tables | When you create an internal table in MaxCompute, you can use the CREATE TABLE LIKE statement to copy the schema of an external table. This makes table creation more efficient. | 2022-08-27 | All regions | Create a table |
| New function for materialized views | A function is added that allows you to query the status of a materialized view. You can use it to check whether the data of the current materialized view is the same as the data of the original table, or whether the data of a partition in the materialized view is the same as the data of the mapped partition in the original table. If the data is the same, True is returned. If the data is different, False is returned. | 2022-08-27 | All regions | Materialized view operations |
| Empty partitions generated in materialized views | When you update the data of a partition in a partitioned materialized view, MaxCompute can generate an empty partition in the materialized view if the calculation result indicates that the partition does not contain data. This ensures that partitions are generated consecutively. | 2022-08-27 | All regions | Materialized view operations |
| Job-level quota groups | MaxCompute allows you to specify a quota group at the job level, which makes quota groups more flexible to use. If specific jobs in a project occupy a large number of resources, the overall timeliness of jobs in the project is affected. For example, data refresh jobs occupy a large number of resources but have low timeliness requirements, whereas specific algorithm jobs occupy a large number of resources and have high timeliness requirements. In this case, you can specify different quota groups for the jobs to isolate their computing resources. This way, you do not need to create another project, migrate the jobs to it, and associate a quota group with it to implement resource isolation. | 2022-08-23 | All regions | Use of computing resources |
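
The descriptions of BITWISE_AND_AGG, MIN_BY, and MAX_BY in the table above can be illustrated with a short Python sketch. This is only a model of the described semantics, not MaxCompute's implementation: the helper names, argument order, and sample rows are hypothetical, and NULL handling is not modeled.

```python
from functools import reduce
from operator import and_

# Hypothetical sample rows: (product, price, flags).
rows = [
    ("apple", 3, 0b1110),
    ("pear", 1, 0b0111),
    ("plum", 5, 0b0110),
]


def min_by(rows, value_idx, key_idx):
    """Return the value column from the row whose key column is smallest."""
    return min(rows, key=lambda r: r[key_idx])[value_idx]


def max_by(rows, value_idx, key_idx):
    """Return the value column from the row whose key column is largest."""
    return max(rows, key=lambda r: r[key_idx])[value_idx]


def bitwise_and_agg(values):
    """Return the bitwise AND of all input values."""
    return reduce(and_, values)


cheapest = min_by(rows, value_idx=0, key_idx=1)    # product with the lowest price
priciest = max_by(rows, value_idx=0, key_idx=1)    # product with the highest price
common_flags = bitwise_and_agg(r[2] for r in rows)  # bits set in every row

print(cheapest, priciest, bin(common_flags))
```

MIN_BY and MAX_BY answer "which row holds the extreme value" questions without a self-join or a subquery on the minimum or maximum.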

Updates in July 2022

| Feature | Description | Release date | Region | References |
| --- | --- | --- | --- | --- |
| Addition of a regular expression function | The regular expression function REGEXP_EXTRACT_ALL is added to MaxCompute. In a single call, this function extracts all substrings of a string that match the specified pattern and returns them as an array. This helps improve data processing efficiency. | 2022-07-14 | All regions | String functions |
| Custom prefixes and extensions of exported file names | When you use the UNLOAD statement to export data from MaxCompute to Object Storage Service (OSS), you can configure prefixes and extensions for the exported data files. | 2022-07-14 | All regions | UNLOAD |
| Table split size settings | MaxCompute allows you to configure a split size for tables to control job parallelism. You can adjust the split size to improve computing efficiency in two cases: resources are sufficient but jobs run slowly, or resources are insufficient and jobs wait a long time for resource allocation. | 2022-07-14 | All regions | SELECT syntax |
| Addition and performance tuning of window functions | The FIRST_VALUE, LAST_VALUE, and NTH_VALUE window functions are added. All window functions are tuned, which significantly improves their computing performance. | 2022-07-14 | All regions | Window functions |
| Addition of aggregate functions | The following aggregate functions are added: BITWISE_OR_AGG, MAP_AGG, MULTIMAP_AGG, MAP_UNION, MAP_UNION_SUM, and HISTOGRAM. You can use these functions to aggregate input bits or maps to facilitate data analysis and statistics. | 2022-07-14 | All regions | Aggregate functions |
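
To make the REGEXP_EXTRACT_ALL entry above concrete, the following Python sketch models the behavior it describes: every match of a pattern is collected from a string in one call and returned as an array. The helper below is a hypothetical stand-in built on Python's re module. The real function's signature, regular expression dialect, and handling of capture groups may differ, so see the String functions reference for details.

```python
import re
from typing import List


def regexp_extract_all(source: str, pattern: str) -> List[str]:
    """Collect every non-overlapping match of `pattern` in `source`.

    Hypothetical model only: it always returns the whole match and ignores
    capture groups.
    """
    return [match.group(0) for match in re.finditer(pattern, source)]


# Hypothetical input: pull every order code out of a log line in one call.
codes = regexp_extract_all("order A12, order B7, order C345", r"[A-Z]\d+")
print(codes)
```

Without such a function, extracting all matches usually requires repeated single-match calls with increasing occurrence indexes.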

Updates in June 2022

| Feature | Description | Release date | Region | References |
| --- | --- | --- | --- | --- |
| Exclusive resource quota for MaxCompute subscription projects | An exclusive resource quota can be created for MaxCompute subscription projects. The compute units (CUs) of an exclusive quota cannot be occupied by jobs that use other quotas, even if some of its CUs are idle. Without an exclusive quota, idle CUs of a quota may be occupied by jobs that use another quota in which the value of Maximum Reserved CUs is greater than the value of Minimum Reserved CUs. You can configure an exclusive resource quota for services such as business intelligence (BI) and ALGO, which may need the CUs of their quota at any time, to prevent those CUs from being occupied by jobs that use other quotas for a long period of time. | 2022-06-27 | All regions | Use MaxCompute Management |
| Maximum number of parallel CUs for a job that uses a subscription quota | The maximum number of parallel CUs that a single job can use is configurable. If a MaxCompute job that uses a subscription quota occupies CUs for a long period of time, other jobs may keep waiting for CUs. This feature helps prevent that situation. Before you configure the maximum, evaluate the required parallelism based on your business requirements. If you set the maximum too low and only a few jobs are running, the jobs may run slowly while some CUs of the quota remain idle. | 2022-06-27 | All regions | Use MaxCompute Management |

Updates in May 2022

| Feature | Description | Release date | Region | References |
| --- | --- | --- | --- | --- |
| Separate charging for MaxCompute external tables based on the types of external tables | MaxCompute external tables are separately charged based on the types of external tables. You can view the fees that are incurred by OSS external tables and Tablestore external tables in your bills. This way, you can view the fees that are incurred by different data sources for joint computing. | 2022-05-17 | All regions | View billing details |

Updates in March 2022

| Feature | Description | Release date | Region | References |
| --- | --- | --- | --- | --- |
| DISTRIBUTED MAPJOIN | In specific scenarios, DISTRIBUTED MAPJOIN hints can be used to improve computing performance and reduce the time required for data computation. | 2022-03-17 | All regions | DISTRIBUTED MAPJOIN |
| Enhancement of OSS external tables | When MaxCompute writes data to an OSS external table, MaxCompute can automatically create a directory to store the data. When you create an OSS external table, you can specify the cache capacity for reading files. | 2022-03-17 | All regions | Create an OSS external table |
| New parsing method of JSON data | If a key in JSON data contains a period (.), you can enclose the key in brackets and single quotation marks (['']) so that MaxCompute can parse the JSON data. | 2022-03-17 | All regions | GET_JSON_OBJECT_TUPLE and JSON_TUPLE |
| Enhancement of the TRIM, LTRIM, and RTRIM functions | MaxCompute allows you to use the TRIM, LTRIM, and RTRIM functions to remove specified characters from both sides, the left side, or the right side of a string. | 2022-03-17 | All regions | String functions |
| Enhancement of the query rewrite feature for materialized views | The query rewrite feature can be enabled for a materialized view to help you rewrite a query statement that includes an OUTER JOIN clause, a UNION clause, or a UNION ALL clause. | 2022-03-17 | All regions | Materialized view operations |
| Skipping of headers or footers of TEXTFILE files | The skip.header.line.count and skip.footer.line.count parameters can be used to skip a specified number of rows at the beginning and end of a CSV file during data processing in MaxCompute. The parameters can also be used if a CSV file is compressed in the .gz, .bz2, or .lzo format. | 2022-03-01 | All regions | Create an OSS external table |
| Compatibility with Apache Spark 3.1 | MaxCompute is compatible with Apache Spark 3.1 in addition to the following Apache Spark versions: 1.6, 2.3, and 2.4. | 2022-03-01 | All regions | Set up a Linux development environment |
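
The effect of skip.header.line.count and skip.footer.line.count described in the table above can be sketched in Python. This is an illustrative model under assumptions, not how MaxCompute reads files: the function name, its arguments, and the sample data are hypothetical, and only gzip compression is shown.

```python
import csv
import gzip
import io


def read_csv_rows(raw, skip_header=0, skip_footer=0, gzipped=False):
    """Parse CSV bytes and drop the first/last N rows (hypothetical helper)."""
    data = gzip.decompress(raw) if gzipped else raw
    rows = list(csv.reader(io.StringIO(data.decode("utf-8"))))
    # Clamp the end index so an oversized footer count yields an empty list.
    end = max(len(rows) - skip_footer, 0)
    return rows[skip_header:end]


# Hypothetical file with one header row and one footer (summary) row.
raw_csv = b"id,name\n1,a\n2,b\ntotal,2\n"

plain = read_csv_rows(raw_csv, skip_header=1, skip_footer=1)
compressed = read_csv_rows(
    gzip.compress(raw_csv), skip_header=1, skip_footer=1, gzipped=True
)
print(plain)
print(compressed)
```

In both calls only the data rows remain, which mirrors the statement that the parameters also apply to compressed CSV files.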

Updates in February 2022

| Feature | Description | Release date | Region | References |
| --- | --- | --- | --- | --- |
| Data security management of Logview | A custom parameter can be used to specify whether to display the execution results of jobs in Logview of MaxCompute. This enhances data security. | 2022-02-25 | All regions | Project operations |
| Table schema changes | The schema of a table in MaxCompute can be changed. You can add fields of complex data types to a table, delete fields from a table, and change the sequence of fields in a table. | 2022-02-23 | All regions | Partition and column operations |

Updates in January 2022

| Feature | Description | Release date | Region | References |
| --- | --- | --- | --- | --- |
| Display of MaxCompute external project metadata on DataWorks Data Map | The metadata of MaxCompute external projects can be displayed on DataWorks Data Map. | 2022-01-10 | Germany (Frankfurt) and Singapore | Lakehouse of MaxCompute |