This topic describes the release notes for DataWorks and provides links to the relevant references.
2024
2024-10
Feature | Description | Release date | Region | Scope | References |
Image management | Custom images can be created as permanent images in DataWorks. This way, the same image environment can be used each time you run a task on a node, which frees you from repeatedly deploying an image environment. This ensures the consistency of the runtime environment and reduces task running duration, computing costs, and traffic costs. | 2024.10.18 | China (Beijing), China (Shanghai), China (Shenzhen), China (Hangzhou), China (Hong Kong), China (Zhangjiakou), Singapore, Malaysia (Kuala Lumpur), Indonesia (Jakarta), Japan (Tokyo), Germany (Frankfurt), UK (London), US (Silicon Valley), and US (Virginia) | All DataWorks users | |
Support for serverless synchronization tasks | Serverless synchronization tasks are supported by Data Integration. You do not need to configure a resource group for a serverless synchronization task. This allows you to focus only on your business. | 2024.10.12 | China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Hong Kong), UK (London), US (Silicon Valley), US (Virginia), Japan (Tokyo), Germany (Frankfurt), and Malaysia (Kuala Lumpur) | All DataWorks users |
2024-09
Feature | Description | Release date | Region | Scope | References |
New type of real-time synchronization task | Real-time synchronization of data from a single Logstore of Simple Log Service to Data Lake Formation (DLF) 2.0 is supported. The data is written to DLF 2.0 in the Paimon format, and simple data processing is supported during data synchronization. | 2024.9.13 | All regions | All DataWorks users | - |
Auto resource scaling during the running of a real-time synchronization task | Resources configured for a real-time synchronization task can be automatically scaled while the task is running, without the need to stop the task. You only need to configure an adjustment plan, and the system adjusts resources for the running task based on the plan. | 2024.9.13 | All regions | All DataWorks users | - |
2024-08
Feature | Description | Release date | Region | Scope | References |
New type of real-time synchronization task | Real-time synchronization of data from a MySQL database to SelectDB or Apache Doris is supported. | 2024.8.29 | All regions | All DataWorks users | Synchronize all data in a MySQL database to SelectDB in real time |
Permission management on Hologres | DataWorks Security Center allows you to manage the permissions for users to access Hologres data. You can configure an authorized identity, request permissions on Hologres tables, process permission requests, and view permission request records and request processing records. | 2024.8.22 | All regions | All DataWorks users | |
Lower permission requirement on the default access identity used to access a MaxCompute data source | The permission requirement on the default access identity used to access a MaxCompute data source is lowered. If you want to set the Default Access Identity parameter to Alibaba Cloud RAM User when you add a MaxCompute data source, you must make sure that the related RAM user has the permissions of the Admin or Super_Administrator role of the related MaxCompute project. Before the permission requirement was lowered, the AdministratorAccess policy had to be attached to the related RAM user. | 2024.8.8 | All regions | All DataWorks users | |
Support for CloudSSO in DataWorks Enterprise Edition | CloudSSO is supported in DataWorks Enterprise Edition. CloudSSO allows you to use a third-party or self-managed identity provider (IdP) to log on to the Alibaba Cloud Management Console to use DataWorks. | 2024.8.8 | All regions | All DataWorks users |
2024-07
Feature | Description | Release date | Region | Scope | References |
RAM policy document update | The | 2024.7.10 | All regions | All DataWorks users | |
Update on features related to the registration of E-MapReduce (EMR) clusters | The following features are supported for the registration of an EMR cluster to DataWorks:
| 2024.7.10 | China (Zhangjiakou), which is the only region that supports EMR Serverless Spark | All DataWorks users | |
New node type in DataStudio | CDH Spark SQL nodes are supported in DataStudio. A CDH Spark SQL node can be used to develop and periodically schedule CDH Spark SQL tasks and integrate the tasks with other types of tasks. | 2024.7.10 | All regions | All DataWorks users |
2024-06
Feature | Description | Release date | Region | Scope | References |
New synchronization link in Data Integration | All data in a MySQL database can be synchronized to StarRocks in offline mode or in real time by using Data Integration. | 2024.06.28 | All regions | All DataWorks users | |
Creation of data push nodes in DataStudio | In a workflow in DataStudio, data push nodes can be created and configured as descendant nodes of the nodes that process data and generate data tables. The data push nodes can then periodically push the data generated by their ancestor nodes to DingTalk or Lark groups as message cards. Note: To use data push nodes, you must submit a ticket to contact technical support to upgrade the resource groups for scheduling. | 2024.6.28 |
| All DataWorks users | |
Release of serverless resource groups | To facilitate the management of resources in DataWorks and improve user experience, serverless resource groups are introduced in DataWorks. A serverless resource group can implement the core features of an exclusive resource group for scheduling, an exclusive resource group for Data Integration, and an exclusive resource group for DataService Studio at the same time. Operations such as data synchronization, task scheduling and running, and API calling and management can be performed by using only one serverless resource group. | 2024.6.11 |
| All DataWorks users | |
New best practice for task development based on Lindorm Distributed Processing System (LDPS) | LDPS is compatible with CDH. Operations such as interactive SQL queries, SQL task development, and JAR task execution can be performed in DataWorks based on LDPS after you register a CDH cluster to DataWorks and configure LDPS connection information. | 2024.6.5 | All regions | All DataWorks users | |
New data source type for data synchronization | Azure Blob Storage data sources are supported for data synchronization. | 2024.6.3 | All regions | All DataWorks users |
2024-05
Feature | Description | Release date | Region | Scope | References |
Support for reading MySQL binary logs from Object Storage Service (OSS) | MySQL binary logs can be read from OSS. When you add a MySQL data source, you can turn on Enable Binary Log Reading from OSS if you set the Configuration Mode parameter to Alibaba Cloud Instance Mode and set the Region parameter to the region in which the current DataWorks workspace resides. After you turn on this switch, DataWorks attempts to obtain binary logs from OSS when it cannot read binary logs from ApsaraDB RDS for MySQL. This prevents real-time synchronization tasks from being interrupted. | 2024.5.24 | All regions | All DataWorks users | |
Update on the Data Quality service | The Data Quality service is updated. After the update, a specific range of data in a table can be checked based on monitoring rules. This helps optimize the process of data quality monitoring. | 2024.5.21 | The new version of Data Quality will be released in phases. You can view the regions where the new version is supported in the DataWorks console. If the features of the new version of Data Quality are unavailable in the region where your business is located, see Data Quality of the previous version. | All DataWorks users | |
New synchronization link in Data Integration | All data in a Hologres database can be synchronized to another Hologres database in offline mode. | 2024.5.20 | All regions | All DataWorks users | Synchronize all data in a Hologres database to another Hologres database in offline mode |
Support for remote access to a host and triggering of script running on the host by DataWorks | An SSH node can be created and used based on a specific SSH data source in DataWorks to remotely access a host that is connected to the data source and trigger script running on the host. | 2024.5.15 | All regions | All DataWorks users | |
Support for EMR Kyuubi nodes in DataStudio | EMR Kyuubi nodes are supported in DataStudio. EMR Kyuubi nodes can be used to develop and periodically schedule Kyuubi tasks and integrate Kyuubi tasks with other types of tasks. | 2024.5.11 | All regions | All DataWorks users | |
New types of database nodes in DataStudio | Multiple types of database nodes are supported in DataStudio, such as DRDS nodes, PolarDB for MySQL nodes, and Doris nodes. These types of nodes can be used to develop and periodically schedule the related types of tasks and integrate the tasks with other types of tasks. | 2024.5.11 | All regions | All DataWorks users |
2024-04
Feature | Description | Release date | Region | Scope | References |
SSL authentication configuration during the addition of a PostgreSQL data source | SSL authentication can be configured when a PostgreSQL data source is added for data synchronization. | 2024.4.26 | All regions | All DataWorks users | |
Support for Hologres data sources in Data Governance Center | Hologres data sources are supported in Data Governance Center. Before you can use a Hologres data source in Data Governance Center, you must collect metadata of Hologres in Data Map. For more information, see Metadata collection. | 2024.4.24 | Data Governance Center supports Hologres data sources only in the following regions: China (Beijing), China (Shanghai), China (Hangzhou), and China (Shenzhen). | All DataWorks users | |
Support for materialized views in Data Governance Center | DataWorks supports automated governance of materialized views based on intelligent recommendations. This is an intelligent and automated solution for frequent big data computing tasks that contain a large number of similar subqueries. If you enable the intelligent recommendation feature on materialized views, DataWorks can automatically identify and classify similar subqueries in MaxCompute and generate recommendations for creating materialized views. You can create a materialized view with a few clicks based on your business requirements. This significantly improves computing efficiency and saves computing resources. | 2024.4.12 | All regions | All DataWorks users | |
Support for mapping between an Alibaba Cloud account or a RAM user and an OpenLDAP account of a CDH or Cloudera Data Platform (CDP) cluster | When you register a CDH or CDP cluster to DataWorks, a mapping can be configured between an Alibaba Cloud account or a RAM user and an OpenLDAP account of a CDH or CDP cluster based on your business requirements. After the mapping is configured, the CDH or CDP tasks that are submitted by the Alibaba Cloud account or the RAM user are run by the mapped OpenLDAP account. If you want to isolate permissions on the data that can be accessed by using different Alibaba Cloud accounts or RAM users in a CDH cluster, you can use the OpenLDAP account mapping type. | 2024.4.8 | China (Beijing), China (Shanghai), China (Hangzhou), China (Shenzhen), China (Zhangjiakou), and China (Chengdu) | All DataWorks users | Configure mappings between tenant member accounts and CDH or CDP cluster accounts |
2024-03
Feature | Description | Release date | Region | Scope | References |
New version of the data backfill feature | A new version of the data backfill feature is released. After an auto triggered task is developed, committed, and deployed, the auto triggered task is run based on the scheduling configurations. If you want to run the auto triggered task in a specified time range, you can backfill data for the task. Data of a historical or future period of time can be backfilled for an auto triggered task to write the data to time-based partitions. The new version supports the following data backfill methods: | 2024.3.28 | All regions | All DataWorks users | Backfill data and view data backfill instances (new version) |
Development and deployment of extensions based on Function Compute in Open Platform | Custom event message logic can be configured for an extension in DataWorks to manage user behavior, such as intercepting or blocking improper behavior. If Function Compute is used to develop and deploy extensions, specific event messages are automatically sent to the related Function Compute service. Take note of the following items:
| 2024.3.19 |
| Users of DataWorks Enterprise Edition | |
Support for custom publishing policies for models in Data Modeling | A publishing policy can be defined for a model based on your business requirements in Data Modeling. After you enable a publishing policy, you can select a publishing mode that meets your business requirements based on the configurations of the policy when you publish a model. | 2024.3.12 | All regions | DataWorks users who activate the Data Modeling service |
2024-02
Feature | Description | Release date | Region | Scope | References |
Addition of usage notes for the development of CDP or CDH tasks in DataWorks | Usage notes for the development of CDP or CDH tasks in DataWorks are added. The usage notes cover the basic development process, fee description, environment preparation, and permission management. | 2024.2.21 | All regions | All DataWorks users | Usage notes for development of CDP or CDH tasks in DataWorks |
Support for StarRocks data sources added in Alibaba Cloud instance mode in DataService Studio | After an EMR Serverless StarRocks cluster is created, the cluster can be added in Alibaba Cloud instance mode to DataWorks as a StarRocks data source. The data source can be quickly encapsulated into an API in DataWorks DataService Studio so that the data can be shared and made available to other services. | 2024.2.20 | All regions | All DataWorks users | |
Search feature for data development code in Data Map | The search feature for data development code is supported in DataWorks Data Map. This feature can be used to search for data development code across workspaces and locate the desired code based on keywords. This helps improve development efficiency and reduce project redundancy. | 2024.2.20 | All regions | Users of DataWorks Standard Edition and more advanced editions | |
Support for data upload and download feature | The data upload and download feature is supported in DataWorks. On-premises CSV files and OSS objects can be uploaded to MaxCompute for processing and analysis. The list of uploaded files and the list of files downloaded by services, such as DataWorks DataAnalysis, can also be managed. | 2024.2.20 | All regions | All DataWorks users | |
Support for CDH-related nodes in DataStudio | CDH-related nodes, such as CDH Hive, CDH Spark, CDH MR, CDH Presto, and CDH Impala nodes, are supported in DataStudio. The nodes can be used to develop and periodically schedule CDH-related tasks. | 2024.2.19 | All regions | All DataWorks users | |
New version of the System Configuration page in Data Security Guard | The following operations can be performed on the System Configuration page:
The preceding operations can help you identify and resolve potential security risks at the earliest opportunity. | 2024.2.6 | All regions | All DataWorks users |
2024-01
Feature | Description | Release date | Region | Scope | References |
Display of masked query results in DataStudio and DataAnalysis | Categorization and sensitivity level classification, sensitive data identification, and display of masked query results for data in EMR tables are supported in Data Security Guard. If the results obtained after you execute an SQL statement in DataStudio or DataAnalysis to query data contain sensitive data, the system automatically masks or encrypts the sensitive data based on specific data masking rules and returns the masked query results. This helps improve enterprise data security. | 2024.1.25 | All regions | All DataWorks users | |
Display of data lineages involved in real-time synchronization links in Data Map | Data lineages involved in the following real-time synchronization links can be parsed and displayed in Data Map:
Combined analysis of real-time synchronization lineages and batch synchronization lineages can help you understand the overall flow of data. | 2024.1.15 | All regions | All DataWorks users |
2023
2023-12
Feature | Description | Release date | Region | Scope | References |
Association of data sources with DataStudio | Data sources or clusters can be associated with DataStudio. After the association, you can use the data sources or clusters to perform data modeling or to periodically schedule tasks in Operation Center. You can also read data in the data sources or clusters and perform data development operations. | 2023.12.29 | All regions | All DataWorks users | Preparations before data development: Associate a data source or a cluster with DataStudio |
New version of data sources | MaxCompute, Hologres, AnalyticDB for PostgreSQL, AnalyticDB for MySQL, and ClickHouse compute engines are managed as data sources, and EMR and CDH or CDP compute engines are managed as open source clusters. This can help improve user experience. After the change, operations that are related to compute engines, such as creating and modifying compute engines, are performed on the Data Sources or Open Source Clusters page in Management Center of the DataWorks console. | 2023.12.29 | All regions | All DataWorks users | |
New extension point events | The following extension point events are added in Open Platform:
| 2023.12.27 | All regions | All DataWorks users | |
New application scopes of extension point events | The following application scopes of extension point events are added:
When you register an extension, you can select only a single type of extension point event. | 2023.12.22 |
| All DataWorks users | |
New check items for SQL efficiency optimization in Data Governance Center | Five check items are added in Data Governance Center. The check items include DescartesChecker for MaxCompute, EMR Hive, and EMR Spark SQL tasks, MasterTableOnConditionChecker, and Force Scan Checker. The check items help you perform checks in advance and optimize code promptly in the R&D stage, which improves computing efficiency, reduces wasted computing resources, and helps ensure timely data output. | 2023.12.22 | All regions | All DataWorks users | |
Full support for StarRocks data sources | StarRocks data sources are fully supported in the following services of DataWorks:
| 2023.12.15 | All regions | All DataWorks users | |
Support for new EMR Hadoop cluster versions | The following EMR Hadoop cluster versions are supported in DataWorks:
| 2023.12.15 | All regions | All DataWorks users | |
New check items for Check nodes in DataStudio | New check items are supported for Check nodes in DataStudio. You can use a Check node to check the availability of MaxCompute partitioned tables, FTP files, and OSS objects based on check policies. If the running of a task depends on such an item, you can configure the task as a descendant task of the Check node. When the condition that is specified in the check policy for the Check node is met, the task on the Check node is successfully run and its descendant task is then triggered to run. | 2023.12.08 | All regions | All DataWorks users | |
Support for PAI DLC nodes in DataStudio | PAI DLC nodes are supported in DataStudio. PAI DLC nodes can be used to periodically schedule DLC tasks. | 2023.12.08 | All regions | All DataWorks users | |
Support for risk identification rules in Security Center | Risk identification rules are supported in Security Center. Security Center allows an administrator to register risk identification capabilities to DataWorks as extensions. This way, the extensions can be used as risk identification rules to identify risks in user operations. You can use a default or custom risk identification rule to identify risks on data download operations and configure a blocking or approval response policy based on your business requirements. | 2023.12.08 | All regions | All DataWorks users |
2023-11
Feature | Description | Release date | Region | Scope | References |
Support for Check nodes in DataStudio | Check nodes are supported in DataStudio. You can use a Check node to check whether a specific partition exists in a MaxCompute partitioned table or whether data is written to the partition. If a task depends on a MaxCompute partitioned table, you can use a Check node to check whether the partition data in the table is available first. This prevents invalid data from being used. | 2023.11.20 |
| All DataWorks users |
2023-08
Feature | Description | Release date | Region | Scope | References |
Support for specifying a scheduling cycle | The scheduling calendar feature is supported. You can specify a scheduling cycle by marking dates on a scheduling calendar as scheduling days or non-scheduling days. | 2023.08.24 | All regions | Users of DataWorks Enterprise Edition | |
Support for governance along the data lake development link in Data Governance Center | Proactive governance is supported for issues identified along the data lake development link, which consists of EMR, Data Lake Formation (DLF), and DataWorks. The following governance capabilities are supported:
| 2023.08.24 |
| Users of DataWorks Enterprise Edition or a more advanced edition |
2023-06
Feature | Description | Release date | Region | Scope | References |
Real-time synchronization from Kafka to Hologres to implement the extract, transform, load (ETL) process |
| 2023.06.01 | All regions | All DataWorks users | |
Real-time synchronization of all data from a MySQL database to an OSS data lake in the Hudi format | All data in a MySQL database can be synchronized to an OSS data lake in real time. The data is written to the data lake in the Hudi format. The following capabilities are supported:
| 2023.06.01 | All regions | All DataWorks users | |
Support for Amazon Relational Database Service (Amazon RDS) data sources for data synchronization | An Amazon RDS data source can be added for data synchronization in the same way as a MySQL data source. An Amazon RDS data source provides the same capabilities as a MySQL data source. | 2023.06.01 | All regions | All DataWorks users |
2023-04
Feature | Description | Release date | Region | Scope | References |
Support for saving data analysis results as MaxCompute tables | Data analysis results can be directly saved as MaxCompute tables for subsequent queries or joint analysis, without the need to run code to create tables for saving the data analysis results. | 2023.4.20 | All regions | All DataWorks users | |
Support for downloading millions of SQL query result records in DataAnalysis | By default, a maximum of 10,000 SQL query result records can be downloaded. Administrators can modify the upper limit for different editions in Security Center: 200,000 for DataWorks Standard Edition, 2,000,000 for DataWorks Professional Edition, and 5,000,000 for DataWorks Enterprise Edition and DataWorks Ultimate Edition. The download feature can be disabled. | 2023.4.18 | All regions | All DataWorks users | |
Launch of public datasets for big data services | Terabytes of data can be quickly analyzed by DataWorks and MaxCompute based on public datasets for big data and AI services from different platforms, such as Taobao, Fliggy, Ali Music, GitHub, and TPC. | 2023.4.11 | All regions | All DataWorks users |
2023-03
Feature | Description | Release date | Region | Scope | References |
Support for governance issue notifications in Data Governance Center | Administrators and individual users can configure notifications for daily governance issues. The system then sends the notifications to the related engineers by system message, email, DingTalk group message, or webhook URL, which facilitates the handling of the governance issues. | 2023.3.15 | All regions | All DataWorks users | |
Support for the long-lifecycle governance item in the storage dimension in Data Governance Center | The long-lifecycle governance item is supported in the storage dimension in Data Governance Center. The governance item can help users specify an appropriate lifecycle for MaxCompute partitioned tables to reduce the waste of storage resources. | 2023.3.15 | All regions | All DataWorks users | |
Commercialization of Acceleration Service provided by DataService Studio | The Acceleration Service solution is introduced in DataService Studio. You can use the solution to create an online API to accelerate the query of MaxCompute data without exporting data from MaxCompute. This improves the query performance and efficiency and meets online query requirements. | 2023.3.1 | China (Shanghai), China (Beijing), China (Hangzhou), and China (Shenzhen) | All DataWorks users |
2023-01
Feature | Description | Release date | Region | Scope | References |
Support for managing purchased resources in DataWorks | All resources that are not released can be displayed in DataWorks. This way, you can efficiently perform operations on the resources, such as upgrading or downgrading specifications, requesting refunds, and renewing the resources. | 2023.1.11 | All regions | All DataWorks users | |
Support for graceful undeployment of multiple tasks at a time in Data Governance Center | The following features are supported:
| 2023.1.9 | All regions | All DataWorks users | |
Support for code review in DataStudio in a workspace in basic mode | The code review feature is supported in DataStudio in a workspace in basic mode. If the forcible code review feature is enabled, the code of a node can take effect in the production environment only after the code of the node passes the code review. | 2023.1.5 | All regions | All DataWorks users |
2022
2022-11
Feature | Description | Release date | Region | Scope | References |
Support for data source-oriented API encapsulation in the development and production environments in DataService Studio | The following features are supported in a workspace in standard mode:
| 2022.11.29 | All regions | All DataWorks users | |
Support for requesting permissions on Hive tables in Data Map | The Request Permissions button is added on the details page of an EMR Hive table in Data Map. You can click this button to request permissions on the table in Security Center. | 2022.11.29 | All regions | All DataWorks users | |
Support for data albums in Data Map | The Data Album page is added in Data Map. The following features are provided:
| 2022.11.16 | All regions | All DataWorks users | |
Brand-new SQL query experience in the upgraded DataAnalysis service | The following features are supported in the upgraded DataAnalysis service:
| 2022.11.15 | All regions | All DataWorks users | |
Support for parsing request and response parameters in an API that is created by using the advanced SQL syntax in script mode in DataService Studio | The following features are supported in DataService Studio:
| 2022.11.10 | All regions | All DataWorks users | None |
2022-10
Feature | Description | Release date | Region | Scope | References |
Support for the EMR Hive compute engine in Data Modeling | The following features are supported by the Dimensional Modeling module in Data Modeling. The features enable Data Modeling to provide the same modeling capabilities as MaxCompute.
| 2022.11.25 | All regions | All DataWorks users | |
Support for version management of models in Data Modeling | The following features are supported by the Dimensional Modeling module in Data Modeling.
| 2022.11.25 | All regions | All DataWorks users | |
Display of API call addresses generated based on domain names on the API details page in DataService Studio | The call addresses that are separately generated for an API based on the Internet domain name, VPC domain name, and independent domain name can be displayed on the details page of the API. You can select an address to call the API based on your business requirements. | 2022.10.21 | All regions | All DataWorks users | |
Upgrade of the lineage feature in Data Map | The lineage feature of Data Map is upgraded to provide a better user experience in data lineage analysis. On the lineage details tab, you can perform the following operations:
| 2022.10.21 | All regions | All DataWorks users | |
Support for new check items in the R&D dimension in Data Governance Center | The following types of check items are added in the R&D dimension in Data Governance Center:
The following features are provided:
| 2022.10.20 | All regions | All DataWorks users | |
Support for code review in DataStudio in a workspace in basic mode | If the forcible code review feature is enabled in a workspace in basic mode, the code of a node can take effect in the production environment only after the code of the node passes the code review. | 2022.9.22 | All regions | All DataWorks users |
2022-08
Feature | Description | Release date | Region | Scope | References |
Task management from the workflow perspective in Operation Center | In Operation Center, the status of tasks can be viewed and operations such as rerunning, freezing, and terminating tasks can be performed from the workflow perspective. | 2022.8.22 | All regions | All DataWorks users | View and manage auto triggered instances from the workflow perspective |
MaxCompute data source-oriented query acceleration in DataService Studio | An online API can be created in DataService Studio by using an acceleration solution to accelerate the query of MaxCompute data, without the need to export data from MaxCompute. This improves the query performance and efficiency and meets online query requirements. The following acceleration solutions are provided:
| 2022.8.17 | China (Shanghai) and China (Shenzhen) | All DataWorks users | |
Intelligent diagnostics and analysis of an API call link in DataService Studio | API call logs can be analyzed in DataService Studio. You can use the log analysis feature to analyze the link of a single API call request. If the API call request fails, you can use this feature to troubleshoot issues at the earliest opportunity and obtain diagnostic results and suggestions. | 2022.8.7 | All regions | DataWorks users | |
Fine-grained permission management at the project and table levels in Data Map | Various policies can be configured to manage permissions on metadata at different granularities in Data Map.
| 2022.8.5 | All regions | DataWorks users | |
Support for creation of a batch synchronization task by using the codeless UI to synchronize data from or to a Dameng database in Data Integration | A batch synchronization task can be created by using the codeless UI to synchronize data from or to a Dameng database in Data Integration. The codeless UI is more convenient than the code editor. | 2022.8.2 | All regions | DataWorks users | Configure a batch synchronization task by using the codeless UI |
2022-07
Feature | Description | Release date | Region | Scope | References |
Support for dimensional modeling in Data Modeling | The following features are supported in Data Modeling:
| 2022.7.29 | China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Chengdu), China (Hong Kong), Singapore, China East 2 Finance, China South 1 Finance, China North 2 Ali Gov, Germany (Frankfurt), and US (Silicon Valley) | All DataWorks users | |
Support for viewing information about associated fields in Data Modeling | The following operations can be performed to view the information about associated fields: Go to the configuration tab of a derived metric or an atomic metric. Click Associate Tables in the right-side navigation pane to view the names of the fields that are associated with the current metric. You can also go to the details page of the table to which an associated field belongs to manage the association. | 2022.7.29 | | All DataWorks users | |
Support for the configuration of a naming rule checker for tables and derived metrics in Data Modeling | A checker at a data layer can be configured to define a naming convention and unify the naming formats of tables and derived metrics at the data layer. When you design tables and derived metrics, the checker can constrain and verify entity names to improve naming compliance throughout the development process. Configuration for rules defined in checkers:
| 2022.7.29 | All DataWorks users | ||
Support for the configuration of exclusive resource groups used by tasks of different compute engine types in DataAnalysis | An Alibaba Cloud account can be used to configure the exclusive resource groups used by tasks of different compute engine types on the System Management page in DataAnalysis. You can perform SQL queries on a specific exclusive resource group. | 2022.7.29 | China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Chengdu), and Germany (Frankfurt) | All DataWorks users | |
Support for synchronization of data in PostgreSQL databases | Synchronization of data in PostgreSQL databases is supported. Two-factor authentication based on the | 2022.7.26 | China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Chengdu), China (Hong Kong), Japan (Tokyo), Singapore, Australia (Sydney) Closing Down, Malaysia (Kuala Lumpur), Indonesia (Jakarta), Germany (Frankfurt), UK (London), US (Silicon Valley), US (Virginia), and UAE (Dubai) | All DataWorks users | |
Support for EMR DataLake clusters | EMR DataLake clusters can be used as compute engines in DataWorks. The following full-lifecycle capabilities that work based on an EMR DataLake compute engine can be implemented: data synchronization, data modeling, data development and scheduling, data quality monitoring, data map, data security, data analysis (related tasks must be run on exclusive resource groups), and data services. | 2022.7.8 | China (Chengdu), China (Zhangjiakou), China (Shenzhen), China (Beijing), China (Shanghai), China (Hangzhou), China (Hong Kong), Japan (Tokyo), Germany (Frankfurt), US (Virginia), US (Silicon Valley), Indonesia (Jakarta), UK (London), Singapore, Malaysia (Kuala Lumpur), and UAE (Dubai) | All DataWorks users | |
Support for field insertion in a visualized manner and verification of permissions on tables by using the intelligent code editor in DataStudio |
| 2022.7.2 | China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Chengdu), and China (Hong Kong) | All DataWorks users |
2022-06
Feature | Description | Release date | Region | Scope | References |
Support for Data Governance Center | Data Governance Center is available and provides the following features:
Note
| 2022.6.27 | China (Hangzhou), China (Shanghai), China (Beijing), China (Shenzhen), China (Chengdu), Singapore, and US (Silicon Valley) | All DataWorks users | |
Support for a panoramic view of a task in Data Governance Center | A panoramic view of a task is provided on the Task 360 page. You can view the following information about a task on the page: governance issues that are identified on the task, operation records of the task, baselines that are affected by the task, and task execution information. The information helps you perform data governance operations on the task. | 2022.6.24 | China (Hangzhou), China (Shanghai), China (Beijing), China (Shenzhen), China (Chengdu), Singapore, and US (Silicon Valley) | All DataWorks users | |
Support for search and creation of views in Data Modeling |
| 2022.6.22 | China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Chengdu), China (Hong Kong), Singapore, US (Silicon Valley), Germany (Frankfurt), China East 2 Finance, China South 1 Finance, and China North 2 Ali Gov 1 | All DataWorks users | |
Support for generation of models based on table name keywords in Data Modeling | The reverse modeling feature can be used to generate logical models based on fuzzy match of table name keywords. | 2022.6.19 | China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Chengdu), China (Hong Kong), Singapore, US (Silicon Valley), Germany (Frankfurt), China East 2 Finance, China South 1 Finance, and China North 2 Ali Gov 1 | All DataWorks users | |
Support for management of operations performed on data synchronization tasks in Approval Center | Request processing policies for data synchronization tasks can be configured in Approval Center to ensure the security of data during data transmission. You can use a combination of a source and a destination to specify a data synchronization task on which an operation request must be processed. For example, if a data synchronization task is saved, the related request processing procedure is triggered. This way, you can manage the data synchronization process in a flexible manner. | 2022.6.15 | China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Chengdu), China (Hong Kong), Singapore, Indonesia (Jakarta), Malaysia (Kuala Lumpur), US (Silicon Valley), US (Virginia), and Germany (Frankfurt) | All DataWorks users | |
Support for lineage graphs of sensitive data in Data Security Guard | The sensitive data lineage graph feature is supported. The feature supports the following sub-features:
Note This feature is available only in DataWorks Enterprise Edition. | 2022.6.14 | China (Hangzhou) and China (Shanghai) | All DataWorks users | |
Support for analysis of abnormal lineages between fields in Data Security Guard | The abnormal lineage analysis feature is supported. The feature provides the following capabilities:
| 2022.6.14 | China (Hangzhou) and China (Shanghai) | All DataWorks users |
2022-05
Feature | Description | Release date | Region | Scope | References |
New version of the risk identification rule management feature | A new version of the risk identification rule management feature is released. The new version of the feature provides built-in risk identification scenarios. Risk identification from various dimensions, such as the category and sensitivity level of data, operation method, and user permissions, is supported. Alert judgment based on the aggregation degree of alert events is supported to prevent false positive alerts. Fine-grained management for high, medium, and low-level risks is supported. This helps you identify various data risks in your enterprise in an all-around manner. | 2022.5.16 | China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Chengdu), and China (Hong Kong) | All DataWorks users |
2022-04
Feature | Description | Release date | Region | Scope | References |
Optimization of the features on the DataStudio page and the display pattern of the status of nodes on the DataStudio page |
| 2022.4.7 | China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Chengdu), and China (Hong Kong) | All DataWorks users | |
Support for the rule list feature and management of multiple monitoring rules in Data Quality | The rule list feature is supported and provides the following functionalities:
| 2022.4.11 | China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Chengdu), and China (Hong Kong) | All DataWorks users | |
Support for more flexible alert settings for baselines in the intelligent baseline feature in Operation Center | The intelligent baseline feature is optimized in the following aspects:
| 2022.4.26 | China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Chengdu), and China (Hong Kong) | All DataWorks users |
2022-03
Feature | Description | Release date | Region | Scope | References |
Support for cross-workspace deployment of objects on the Deploy page and optimized management of deploy operations | Objects, such as tasks, resources, and functions, in a workspace can be deployed to another workspace. | 2022.3.2 | All regions | Users who require strong control on deploy operations, such as users in the finance sector or public service sectors | Create a deployment package to deploy objects in the deployment package across workspaces |
Integration of DataAnalysis with ActionTrail and monitoring of operation records in DataAnalysis by ActionTrail | DataAnalysis is integrated with ActionTrail and the following types of operation records can be monitored by ActionTrail:
| 2022.3.20 | All regions | All DataWorks users | |
Optimization of the ranking feature for check items and governance items in Data Governance Center | The ranking feature is optimized in the following aspects:
| 2022.3.21 | All regions | Users who participate in the invitational preview of Data Governance Center | |
Configuration optimization of synchronization tasks in Data Integration | More than 1,000 tables can be synchronized when a real-time synchronization task is run to synchronize data to MaxCompute or a real-time synchronization task is run to synchronize data to Hologres on an exclusive resource group for Data Integration. This improves the efficiency of data synchronization. | 2022.3.25 | All regions | Users who need to synchronize large amounts of data, such as users of Software as a Service (SaaS) platforms or users in the finance sector |
2021
2021-12
Feature | Description | Release date | Region | References |
Support for configuration of a monitoring rule for multiple tables at the same time based on a rule template in Data Quality | A rule template can be selected to configure a monitoring rule for multiple tables at the same time. This simplifies the configuration. | 2021.12.14 | All regions | Configure a monitoring rule for multiple tables based on a template |
Support for the resource usage analysis feature in Data Governance Center | The resource usage analysis feature is provided by DataWorks Data Governance Center. The feature allows you to view the overall resource consumption, resource consumption changes, and resource consumption details in the following dimensions: MaxCompute storage resource consumption, MaxCompute computing resource consumption, resource consumption of DataWorks task scheduling, and resource consumption of DataWorks batch synchronization. | 2021.12.9 | All regions |
2021-11
Feature | Description | Release date | Region | References |
Support for the resource group orchestration feature in DataStudio | The resource group orchestration feature is supported. The feature allows you to change resource groups for the scheduling of multiple nodes in a workflow at the same time. If multiple resource groups for scheduling exist in your workspace, you can change the resource groups for scheduling of nodes in the workspace based on your business requirements. This can facilitate reasonable resource usage. | 2021.11.30 | All regions | |
Support for the batch operation feature in DataStudio | Operations can be performed on multiple DataWorks objects at the same time. DataWorks allows you to modify configurations, such as the owners of multiple nodes, resources, or functions, at the same time. After the modification, you can commit and deploy the nodes, resources, or functions to the production environment for the modifications to take effect. | 2021.11.11 | All regions | Perform operations on multiple DataWorks objects at the same time |
2021-10
Feature | Description | Release date | Region | References |
Support for the reverse modeling and naming dictionary features in Data Modeling |
| 2021.10.30 | The features are in public preview in the following regions: China (Beijing), China (Shanghai), China (Hangzhou), China (Shenzhen), China (Zhangjiakou), China (Chengdu), Singapore, US (Silicon Valley), Germany (Frankfurt), China (Hong Kong), China East 2 Finance, and China South 1 Finance. | |
Support for the code search feature in DataStudio | The code search feature is supported. The feature allows you to query code snippets in the code of nodes by keyword. The search results show the details of each code snippet and the nodes whose code contains the code snippets. You can use the feature to trace the node that causes changes in a table. | 2021.10.27 | All regions |
2021-09
Feature | Description | Release date | Region | References |
Support for the display of DataService Studio APIs in Data Map | DataService Studio API assets, such as wizard APIs, script APIs, and registration APIs, can be displayed in Data Map. You can search for and manage APIs from the global perspective or based on business scenarios. In Data Map, you can perform specific operations on APIs, such as globally searching for an API, viewing statistics on popular APIs, viewing information about an API on the details page of the API, and viewing the API distribution that belongs to each data source type. | 2021.09.30 | All regions | |
Release of Data Governance Center of the latest version | Issues that must be handled in the data storage, task computing, code development, data quality, and security dimensions can be detected by Data Governance Center from the global, workspace, and personal perspectives. Data Governance Center provides health scores to evaluate the effectiveness of data governance and visualizes the governance results by providing governance reports and the rankings of governance issues. This helps you troubleshoot issues in an efficient manner and achieve governance goals. | 2021.09.12 | Data Governance Center of the latest version is in public preview in the China (Shanghai), China (Hangzhou), China (Beijing), and China (Shenzhen) regions. |
2021-07
Feature | Description | Release date | Region | References |
Release of Approval Center in Data Governance Center | The DataWorks Approval Center feature is released. You can use this feature to manage permissions on data and manage high-risk operations. You can also use this feature to specify the scope of requests and customize request processing procedures to meet the request processing requirements of your enterprise in different compliance scenarios. | 2021.07.16 | All regions | |
Support for issuing tasks to EMR gateway nodes | Parameters on the Advanced Settings tab can be configured to issue tasks to EMR gateway nodes to balance loads. You can issue workspace-level tasks to nodes in the future. | 2021.07 | All regions |
2021-08
Feature | Description | Release date | Region | References |
Exclusive resource groups for DataService Studio in the China (Hangzhou) and China (Shanghai) regions | Exclusive resource groups for DataService Studio are available in the China (Hangzhou) and China (Shanghai) regions. If high queries per second (QPS) and service level agreement (SLA) guarantees are required when you call APIs in DataService Studio, you can use exclusive resource groups for DataService Studio to ensure successful API calls. Exclusive resource groups for DataService Studio can meet the requirements of highly concurrent, frequent API calls and help ensure that responses are returned at the earliest opportunity. | 2021.08.06 | China (Hangzhou) and China (Shanghai) | |
Commercial release of DataWorks Migration Assistant | Migration Assistant can be used to migrate data development objects across different DataWorks editions, Alibaba Cloud accounts, regions, and workspaces. You can export the data objects in your workspace, including auto triggered tasks, manually triggered tasks, resources, functions, data sources, table metadata, ad hoc queries, and SQL script templates. You can also create full export tasks, incremental export tasks, or custom export tasks to export your data objects in DataWorks based on your business requirements. | 2021.08.01 | All regions |
2021-06
Feature | Description | Release date | Region | References |
Development and O&M of EMR Spark Streaming nodes | EMR Spark Streaming and EMR Streaming SQL nodes are supported in DataWorks. You can develop an EMR Spark Streaming or EMR Streaming SQL node, test the node, and then commit the node to the production environment. You can rerun the node if the node fails to run. You can also perform the following operations: view the status and details of the node, start, terminate, or undeploy the node, monitor the node, and send notifications if errors occur on the node. | 2021.06 | All regions | |
Migration of EMR data development tasks to DataWorks | Workflows (nodes and scheduling settings), manually executed jobs, resources, and data sources can be migrated from an EMR cluster to a DataWorks workspace by using Migration Assistant of DataWorks. You can go to the Migration Assistant page in the DataWorks console to view the migration progress, results, and reports. | 2021.06 | All regions | |
Support for the resource O&M feature in Operation Center | The resource O&M feature is supported in Operation Center. This feature can help you monitor the usage of resource groups that are used to run a node. | 2021.06.09 | All regions | |
MaxCompute data source-based API encapsulation | MaxCompute tables can be accessed and used to encapsulate APIs in DataService Studio. This feature is in canary release. Such APIs query data based on the MaxCompute Query Acceleration (MCQA) feature of MaxCompute. This helps you achieve quick and efficient API calls. You can run MaxCompute tasks only on exclusive resource groups for DataService Studio. | 2021.06 | All regions | None |
Configuration of alert contacts in DataWorks | A RAM user or a RAM role can be added as an alert contact on the Alert Contacts page of the DataWorks console. If an error occurs during the running of a task, DataWorks sends alert notifications to the alert contact that you specified. This way, you can handle exceptions at the earliest opportunity. | 2021.06 | All regions |
2021-05
Feature | Description | Release date | Region | References |
Real-time data synchronization to AnalyticDB for MySQL V3.0 | A real-time synchronization task can be created in DataWorks to synchronize data from a MySQL, OceanBase, or PolarDB database to an AnalyticDB for MySQL data source. The real-time synchronization task synchronizes full data from a database at a time and then synchronizes incremental data from the database in real time to the AnalyticDB for MySQL V3.0 data source. In addition, columns that you add to the source are automatically added to the destination during real-time synchronization. | 2021.05.25 | All regions | |
Public preview of Open Message | The Open Message service is supported in DataWorks. You can enable the message subscription feature in DataWorks Open Message. The Open Message service is in public preview. Only users of DataWorks Enterprise Edition can join the public preview. If your DataWorks service is of Enterprise Edition, you can use the Open Message service on a trial basis and are not charged additional fees during the public preview. You can use Open Message to obtain metadata and task change events in DataWorks. This way, DataWorks can be deeply integrated with your system. | 2021.05.21 | China (Beijing), China (Hangzhou), China (Shenzhen), and China (Shanghai) | |
Support for scheduling tasks to run at a specified point in time on specific days every year or at a specified point in time on the last day of a month | Tasks can be scheduled to run at a specified point in time on specific days every year or at a specified point in time on the last day of a month. This way, you can schedule tasks to run at a specified point in time on the last day of every year, quarter, or month. DataWorks allows you to schedule tasks by minute, hour, day, week, month, or year. | 2021.05.19 | All regions | |
Support for ClickHouse data sources | ClickHouse data sources are supported by DataWorks. ETL operations such as data synchronization, data development, task scheduling, and task O&M related to ClickHouse data sources are allowed and management capabilities for the ETL operations are provided. | 2021.05.15 | All regions |
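The scheduling entry above (running tasks on the last day of a month, quarter, or year) depends on a calendar check that is easy to get wrong around leap years. The following is a minimal Python sketch of that check only. It is not how DataWorks implements scheduling, and the function names are hypothetical.

```python
import calendar
from datetime import date


def is_last_day_of_month(day: date) -> bool:
    """Return True if `day` is the final calendar day of its month."""
    # monthrange returns (weekday of the first day, number of days in the month).
    days_in_month = calendar.monthrange(day.year, day.month)[1]
    return day.day == days_in_month


def is_last_day_of_quarter(day: date) -> bool:
    """Return True if `day` is the final day of March, June, September, or December."""
    return day.month in (3, 6, 9, 12) and is_last_day_of_month(day)


def is_last_day_of_year(day: date) -> bool:
    """Return True if `day` is December 31."""
    return day.month == 12 and day.day == 31
```

For example, `is_last_day_of_month(date(2020, 2, 28))` is false because 2020 is a leap year, whereas the same call for 2021 is true.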
2021-04
Feature | Description | Release date | Region | References |
Real-time data synchronization to AnalyticDB for MySQL V3.0 | A real-time synchronization task can be created in Data Integration to synchronize data from multiple tables to an AnalyticDB for MySQL V3.0 data source in real time. Columns that you add to a source table by executing DDL statements are automatically added to the destination table during real-time synchronization. | 2021.4.20 | All regions | Create a real-time synchronization solution to synchronize data to AnalyticDB for MySQL V3.0 |
Support for FTP Check nodes in DataStudio | An FTP Check node can be created in DataStudio to periodically detect whether a specific file exists based on FTP. If the FTP Check node detects that the file exists, the scheduling system runs the descendant node of the FTP Check node. Otherwise, the FTP Check node checks for the file again at the configured detection interval and keeps retrying until the condition to stop the detection is met. In most cases, FTP Check nodes are used for communications between the DataWorks scheduling system and external scheduling systems. | 2021.4.15 | China (Beijing), China (Shanghai), China (Hangzhou), China (Shenzhen), China (Zhangjiakou), China (Chengdu), and Singapore |
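The detection behavior described for FTP Check nodes is a poll-and-retry loop. The following Python sketch illustrates the general pattern under stated assumptions: the stop condition is modeled as a maximum number of attempts, and `file_exists` is a caller-supplied check. Neither reflects the actual configuration options of the node, and `wait_for_file` is a hypothetical name.

```python
import time
from typing import Callable


def wait_for_file(
    file_exists: Callable[[], bool],
    interval_seconds: float,
    max_attempts: int,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the file is detected or the attempts are used up.

    Returns True when the file is found (downstream work can start),
    and False when the assumed stop condition is reached first.
    """
    for attempt in range(max_attempts):
        if file_exists():
            return True
        # Wait for the detection interval before the next check,
        # but do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    return False
```

In practice, `file_exists` could wrap a directory listing made with the standard `ftplib` module. The `sleep` parameter is injectable so that the loop can be exercised without real delays.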
2021-03
Feature | Description | Release date | Region | References |
Support for custom roles in workspaces of DataWorks Enterprise Edition | Custom roles are supported in DataWorks Enterprise Edition. You can grant permissions to the roles based on your business requirements. | 2021.3.22 | All regions | |
Kerberos authentication in Data Integration | Kerberos authentication is supported in Data Integration. If you want to perform identity authentication for data sources, such as Hive and Kafka, upload the files required to configure Kerberos authentication when you add the data sources. This ensures that you can access the data sources in a secure manner. | 2021.3.16 | All regions | |
Security Center of the latest version | Security Center of the latest version is released. You can use Security Center to build a security system that can secure data and personal privacy in an efficient manner. Security Center can meet various security requirements in high-risk scenarios, such as auditing. You can use Security Center without the need to perform additional configurations. | 2021.03.13 | All regions | |
Support for the node aggregation, ancestor node analysis, and descendant node analysis features in Operation Center | The node aggregation feature is supported. This feature allows you to aggregate nodes in a directed acyclic graph (DAG) from different dimensions, such as workspace, owner, or priority. This way, you can view the total number of nodes from a specified dimension. The ancestor node analysis and descendant node analysis features are also supported. The features allow you to analyze the ancestor and descendant nodes of a specific node. This way, you can quickly find the ancestor node that blocks the running of the node, view the number of the descendant nodes of the node based on the analysis results, and understand the running status of all nodes. | 2021.03.10 | China (Shenzhen) |
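Ancestor and descendant analysis on a DAG amounts to a reachability search, and node aggregation amounts to counting nodes by an attribute. The following Python sketch shows both ideas on a small in-memory graph. The graph shape, node names, and helper names are hypothetical and do not represent the Operation Center data model.

```python
from collections import Counter, deque
from typing import Dict, List, Set


def reachable(graph: Dict[str, List[str]], start: str) -> Set[str]:
    """Return every node reachable from `start` by following edges (start excluded)."""
    seen: Set[str] = set()
    queue = deque(graph.get(start, []))
    while queue:
        node = queue.popleft()
        if node not in seen:
            seen.add(node)
            queue.extend(graph.get(node, []))
    return seen


def reverse(graph: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Flip every edge so that a child points to its parents."""
    flipped: Dict[str, List[str]] = {node: [] for node in graph}
    for parent, children in graph.items():
        for child in children:
            flipped.setdefault(child, []).append(parent)
    return flipped


def count_by(attribute: Dict[str, str]) -> Counter:
    """Aggregate nodes by an attribute value such as owner or workspace."""
    return Counter(attribute.values())


# Hypothetical DAG: each node maps to the nodes that run after it.
downstream = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
owners = {"a": "alice", "b": "bob", "c": "alice", "d": "bob"}

descendants_of_a = reachable(downstream, "a")
ancestors_of_d = reachable(reverse(downstream), "d")
nodes_per_owner = count_by(owners)
```

Running the same search on the reversed graph yields ancestors, so one traversal routine covers both analyses.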
2021-01
Feature | Description | Release date | Region | References |
Support for RestAPI data sources in Data Integration | RestAPI data sources are supported in Data Integration. Such data sources provide or receive data by using RESTful API operations. Data Integration supports batch synchronization of data in these data sources. | 2021.1.4 | All regions |
2021-02
Feature | Description | Release date | Region | References |
Creation of multiple metadata crawlers at a time by using the data discovery feature | The data discovery feature of Data Map can be used to create multiple metadata crawlers at a time. This way, you can quickly view the table schema and associations between tables. | 2021.02.17 | All regions | |
Task migration from Airflow by using Migration Assistant | Tasks in Airflow can be migrated to DataWorks by using Migration Assistant. | 2021.02.16 | All regions | |
Support for view of API statistics in DataService Studio | API statistics can be viewed on the Statistics Dashboard and Statistics Details pages of DataService Studio. The Statistics Dashboard page of DataService Studio provides various charts and tables to show API statistics. For example, you can view the total number of APIs in a workspace and the total number of API calls. This helps you obtain information about API calls from a global perspective. On the Statistics Details page of DataService Studio, you can view the monitoring charts to obtain information about a specific API, such as API gateway status codes and DataService Studio error codes. | 2021.02.16 | China (Beijing) | |
Open Platform | The Open Platform service is available in DataWorks. This service allows you to view the metering reports of APIs and the call details on a specified date. | 2021.02.13 | All regions |
2020
2020-12
Feature | Description | Release date | Region | References |
Full and incremental data synchronization to Elasticsearch | Full and incremental data in all tables or specific tables in a database can be synchronized to Elasticsearch. | 2020.12.30 | All regions | Create a real-time synchronization solution to synchronize data to Elasticsearch |
2020-09
Feature | Description | Release date | Region | References |
Real-time synchronization in Data Integration | The real-time synchronization feature is supported in Data Integration. This feature allows you to synchronize data changes from a single table or all tables in a source database to a destination database in real time. This way, data in the destination database is consistent with data in the source database in real time. You can create a synchronization task to synchronize full and incremental data between different data sources. | 2021.4.15 | All regions |
2020-07
Feature | Description | Release date | Region | References |
Public preview of API operations | API operations of multiple modules are provided to help you use DataWorks in a flexible manner. These modules include tenants, metadata, DataStudio, Operation Center, Data Quality, and DataService Studio. Note You can use the API operations only in DataWorks Enterprise Edition or a more advanced edition. | 2020.07.16 | China (Hangzhou), China (Shanghai), China (Shenzhen), China (Beijing), and China (Zhangjiakou) | |
Public preview of Migration Assistant | You can export the data objects in your workspace, including auto triggered tasks, manually triggered tasks, resources, functions, data sources, table metadata, ad hoc queries, and SQL script templates. You can also create full export tasks, incremental export tasks, or custom export tasks to export your data objects in DataWorks based on your business requirements. | 2020.07 | China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Chengdu), and Singapore | |
Upgrade of DataService Studio | The items in the left-side navigation pane of DataService Studio are adjusted. | 2020.07.28 | | |
2020-06
Feature | Description | Release date | Region | References |
Data source query | The data source query feature is supported. When you modify a workbook, you can use this feature to read data from a data source for analysis. | 2020.06.09 | China (Shanghai) |
2020-04
Feature | Description | Release date | Region | References |
Phone call-based alerting in Operation Center | Alert notifications can be sent by phone call, text message, and email. Important You can use the phone call-based alerting feature only in DataWorks Professional Edition or a more advanced edition. | 2020.04.15 | All regions |