This topic describes the change history of the DataWorks documentation. You can use it to learn about new features and feature changes in DataWorks.

Note: DataWorks is updated automatically. Updates do not affect existing users.

Changes in November 2020

Date | Feature | Type | Description | Product document
November 18, 2020 | New feature | New API operation | A topic is added to describe how to call the CreateManualDag operation to manually trigger a workflow. | CreateManualDag
November 18, 2020 | New feature | New API operation | A topic is added to describe how to call the GetManualDagInstances operation to query information about manually triggered workflow instances. | GetManualDagInstances
November 18, 2020 | New feature | New API operation | A topic is added to describe how to call the GetDag operation to query the details of a directed acyclic graph (DAG) based on the DAG ID. | GetDag
November 18, 2020 | New feature | New API operation | A topic is added to describe how to call the SearchNodesByOutput operation to query a node based on its output. | SearchNodesByOutput
November 10, 2020 | New FAQ | Experience optimization | A topic is added to provide answers to commonly asked questions about Operation Center. | Operation Center
November 02, 2020 | New feature | New feature | A topic is added to describe the code review feature in DataWorks. If the forcible code review feature is enabled, you must commit each node so that the specified reviewer can review its code. You can deploy the node only after the reviewer approves the code. | Code review

Changes in October 2020

Date | Feature | Type | Description | Product document
October 30, 2020 | New API operation overview | Experience optimization | A topic is added to describe the applicable scopes, billing methods, and call limits of DataWorks API operations. | Overview
October 30, 2020 | New feature | New feature | A topic is added to describe how to configure SQLServer Change Data Capture (CDC) Reader. After the CDC feature of SQL Server is enabled, SQLServer CDC Reader allows Data Integration to monitor and collect the logs of insert, update, and delete operations on SQL Server tables. Data Integration then converts the logs to real-time messages. This way, data is synchronized from SQL Server in real time. | Configure SQLServer CDC Reader
October 28, 2020 | New feature | New feature | A topic is updated to describe how to create an E-MapReduce table. | Create an E-MapReduce table
October 28, 2020 | New feature | New feature | A topic is added to describe how to use E-MapReduce in DataWorks. In DataWorks, you can create nodes such as Hive, E-MapReduce MR, Presto, and Spark SQL nodes based on an E-MapReduce compute engine and configure E-MapReduce workflows. You can also schedule the nodes and manage metadata. This helps E-MapReduce users produce data more efficiently. | E-MapReduce access modes

Changes in September 2020

Date | Feature | Type | Description | Product document
September 03, 2020 | Billing updates | Pricing | Billing policies are updated. Pay-as-you-go allows you to use all the basic features of DataWorks in a cost-efficient manner. | Pay-as-you-go
September 03, 2020 | Updates | Experience optimization | A topic is updated to describe Alibaba Cloud DataWorks, its features, and its limits. | What is DataWorks?
September 02, 2020 | New tutorial | Experience optimization | A topic is added to describe how to use DataWorks together with Machine Learning Platform for AI (PAI) to automatically identify users who steal electricity. This helps ensure that electricity is used safely. | Overview

Changes in August 2020

Date | Feature | Type | Description | Product document
August 07, 2020 | New resource group | Experience optimization | A topic is added to describe custom resource groups for scheduling. DataWorks provides custom resource groups for scheduling and custom resource groups for Data Integration to ensure that nodes are flexibly scheduled and data is synchronized as early as possible. This topic describes how to create a custom resource group for scheduling and change the resource group that is used to run nodes. | Create custom resource groups for scheduling
August 07, 2020 | New connection | New feature | A topic is added to describe how to configure a Hive connection. A Hive connection allows you to read data from and write data to Hive. You can configure sync nodes by using the codeless user interface (UI) or code editor. | Configure a Hive connection
August 07, 2020 | New connection | New feature | A topic is added to describe how to configure a GBase 8a connection. A GBase 8a connection allows you to read data from and write data to GBase 8a. You can configure sync nodes by using the codeless UI or code editor. | Configure a GBase 8a connection
August 07, 2020 | New connection | New feature | A topic is added to describe how to configure a Hologres connection. A Hologres connection allows you to read data from and write data to Hologres. You can configure sync nodes by using the codeless UI or code editor. | Configure a Hologres connection
August 07, 2020 | New connection | New feature | A topic is added to describe how to configure an HBase connection. An HBase connection allows you to read data from and write data to HBase. You can configure sync nodes by using the code editor. | Configure an HBase connection
August 07, 2020 | New connection | New feature | A topic is added to describe how to configure an Elasticsearch connection. An Elasticsearch connection allows you to read data from and write data to Elasticsearch. You can configure sync nodes by using the code editor. | Configure an Elasticsearch connection
August 07, 2020 | New FAQ | Experience optimization | A topic is added to describe how to troubleshoot issues related to connectivity, parameters, and permissions when you create connections in DataWorks. | Troubleshooting for connections
August 07, 2020 | New feature | New feature | A topic is added to describe how to create an E-MapReduce Presto node. E-MapReduce Presto nodes allow you to interactively analyze and query large-scale structured and unstructured data. | EMR Presto node
August 05, 2020 | New SDK for Java | New SDK for Java | A topic is added to describe how to configure the required dependencies and environment for SDK for Java. | Install the Alibaba Cloud SDK for Java
August 05, 2020 | New release notes of features | Experience optimization | A topic is added to describe DataWorks features and documentation. | Release notes of key features

Changes in June 2020

Date | Feature | Type | Description | Product document
June 30, 2020 | New FAQ | Experience optimization | Topics are added to provide answers to commonly asked questions about DataWorks services and features. Such services and features include Data Integration, DataStudio, custom resource groups, exclusive resource groups, dependencies, Alarm, and DataService Studio. | FAQ
June 28, 2020 | New feature | New feature | A topic is added to describe how to add a route to a virtual private cloud (VPC) or a data center. | Add a route
June 28, 2020 | New best practice | Experience optimization | A best practice is added to describe how to use exclusive resources for Data Integration to migrate data from a self-managed MySQL database hosted on Elastic Compute Service (ECS) to MaxCompute. | Migrate data from a user-created MySQL database on an ECS instance to MaxCompute
June 28, 2020 | New best practice | Experience optimization | A best practice is added to describe how to use Artificial Intelligence Recommendation. Artificial Intelligence Recommendation is based on the cutting-edge big data and artificial intelligence (AI) technologies of Alibaba as well as years of experience in the e-commerce industry. It provides a personalized recommendation service for developers to increase the customer purchase rate and order conversion rate. | Intelligently recommend items on e-commerce websites
June 28, 2020 | New best practice | Experience optimization | A best practice is added to describe how to grant specified users access to specified resources such as tables and user-defined functions (UDFs). This best practice involves data encryption and decryption algorithms and relates to data security. | Grant access to a specific UDF to a specified user
June 28, 2020 | New best practice | Experience optimization | A best practice is added to describe how to build a data warehouse for an enterprise based on AnalyticDB for MySQL and use the data warehouse for O&M and metadata management. | Build a data warehouse for an enterprise based on AnalyticDB for MySQL
June 28, 2020 | New best practice | Experience optimization | A best practice is added to describe how to use a PyODPS node in DataWorks to segment Chinese text based on Jieba, an open source segmentation tool. The topic also describes how to write the segmented words and phrases to a new table and use closure functions to segment Chinese text based on a custom dictionary. | Use a PyODPS node to segment Chinese text based on Jieba
June 28, 2020 | New best practice | Experience optimization | A best practice is added to describe how to use a PyODPS node that is running on an exclusive resource group to send emails. | Use a PyODPS node to send emails
June 28, 2020 | New best practice | Experience optimization | A best practice is added to describe how to connect DataV to DataWorks DataService Studio. You can create APIs in DataService Studio and call the APIs in DataV. DataV then presents analysis results of the MaxCompute data. | Connect DataV to DataWorks DataService Studio
June 28, 2020 | New best practice | Experience optimization | A best practice is added to describe how to automatically synchronize Internet of Things (IoT) data to the cloud. The IoT is a network that carries data based on the Internet and traditional telecommunication networks. It enables connections among all physical objects that are independently addressable. | Automatically synchronize IoT data to the cloud
June 16, 2020 | New tutorial | Experience optimization | A tutorial is added to describe how to ensure data quality. Data quality is the basis for effective and accurate data analysis. | Overview
June 16, 2020 | New tutorial | Experience optimization | A tutorial is added to help you understand and use big data services of Alibaba Cloud to build an online operation analysis platform. | Business scenarios and development process
June 15, 2020 | New connection | New connection | A topic is added to describe how to configure an ApsaraDB for OceanBase connection. An ApsaraDB for OceanBase connection allows you to read data from and write data to ApsaraDB for OceanBase. You can configure sync nodes by using the code editor. | Configure an ApsaraDB for OceanBase connection
June 15, 2020 | New connection | New connection | A topic is added to describe how to configure a Vertica connection. A Vertica connection allows you to read data from and write data to Vertica. You can configure sync nodes by using the code editor. | Configure a Vertica connection
June 15, 2020 | New plug-in | New plug-in | A topic is added to describe the data types that are supported by Gbase8a Reader and the parameters that you can use to configure Gbase8a Reader. For example, you can specify the connection and configure field mapping for Gbase8a Reader. This topic also provides an example for configuring Gbase8a Reader. | Gbase8a Reader
June 15, 2020 | New plug-in | New plug-in | A topic is added to describe Hologres Reader. Hologres Reader allows you to export data from Hologres data warehouses. You can read data from Hologres tables and then write the data to other data stores based on the standard protocol of Data Integration. | Hologres Reader
June 15, 2020 | New plug-in | New plug-in | A topic is added to describe Time Series Database (TSDB) Reader. TSDB Reader allows you to read data from TSDB. This topic describes the data types that are supported by TSDB Reader and the parameters that you can use to configure TSDB Reader. For example, you can specify the connection and configure field mapping for TSDB Reader. This topic also provides an example for configuring TSDB Reader. | TSDB Reader
June 15, 2020 | New plug-in | New plug-in | A topic is added to describe Hologres Writer. Hologres Writer allows you to import data from multiple data stores to Hologres for real-time data analysis. | Hologres Writer
June 15, 2020 | New configuration | New configuration | A topic is added to describe how to configure a resource group for node scheduling. You can select the required resource group for node scheduling in the Resource Group section. | Configure the resource group
June 15, 2020 | New description | Experience optimization | A topic is added to describe the logic of scheduling dependencies. Correct node dependencies are fundamental to effective workflows. They help ensure that business data is produced effectively and correctly, and they support standardized data development. | Logic of scheduling dependencies
June 15, 2020 | New resource group | New feature | A topic is added to describe how to add and use exclusive resource groups for scheduling. DataWorks allows you to bind an exclusive resource group for scheduling to a VPC so that the resource group can access data stores in the VPC. | Create and use an exclusive resource group for scheduling

Changes in May 2020

Date | Feature | Type | Description | Product document
May 27, 2020 | New usage notes | Experience optimization | A topic is added to describe the scenarios and methods of using shared resource groups, exclusive resource groups, and custom resource groups in DataWorks. | Usage notes of DataWorks resource groups
May 27, 2020 | New feature | New feature | A topic is added to describe how to manage report templates. You can create a template of data quality reports on the Report Template Management page. DataWorks Data Quality can periodically generate and send data quality reports based on the template. | Create and manage report templates
May 27, 2020 | New feature | New feature | A topic is added to describe how to manage rule templates. DataWorks Data Quality allows you to manage a set of custom rules and create a rule template library to configure rules more efficiently. | Create, manage, and use rule templates
May 27, 2020 | New feature | New feature | A topic is added to describe the verification logic of Data Quality and the built-in rule templates for monitoring offline data. | Built-in rule templates for offline data

Changes in April 2020

Date | Feature | Type | Description | Product document
April 19, 2020 | Product upgrade | DataWorks V3.0 | Topics are updated to describe how to use Operation Center. In Operation Center, you can view the dashboard, manage auto triggered nodes and manually triggered nodes, and monitor nodes. | Operation Center
April 18, 2020 | Product upgrade | DataWorks V3.0 | Topics are updated to describe the overall process of building a MaxCompute data warehouse. | Overall process
April 18, 2020 | Product upgrade | DataWorks V3.0 | Topics are updated to describe Data Integration. Data Integration is a stable, efficient, and scalable data synchronization service. It is designed to migrate and synchronize data quickly and reliably between a wide range of heterogeneous data stores in complex network environments. | Data Integration
April 08, 2020 | Product upgrade | DataWorks V3.0 | A topic is updated to guide you through a complete process of data analytics and O&M. | Overview
April 08, 2020 | Product upgrade | DataWorks V3.0 | Topics are updated to provide an overview of DataWorks, including the basic concepts, usage scenarios, and data analytics processes. | What is DataWorks?

Changes in March 2020

Date | Feature | Type | Description | Product document
March 26, 2020 | New tutorial | Experience optimization | A tutorial is added to describe the complete operations in the DataWorks for E-MapReduce workshop. | DataWorks for EMR Workshop
March 17, 2020 | Product upgrade | DataWorks V3.0 | A topic is updated to describe the upgraded data analytics mode. In the upgraded data analytics mode, you can group multiple workflows under a solution in a workspace. The previous hierarchical structure is no longer used. | Create a solution
March 17, 2020 | Product upgrade | DataWorks V3.0 | Topics are updated to describe various types of nodes in DataWorks, including batch sync nodes, MaxCompute nodes, E-MapReduce nodes, general nodes, and custom nodes. | Node types
March 17, 2020 | Product upgrade | DataWorks V3.0 | Topics are updated to describe how to configure DataStudio settings on the Setup page. For example, you can configure code and folder settings, change themes, and display or hide modules. | Setup
March 02, 2020 | Product upgrade | DataWorks V3.0 | Topics are updated to provide an overview of the DataWorks console. You can view the workspaces, resource groups, and compute engines in the DataWorks console. | DataWorks console overview

Changes in February 2020

Date | Feature | Type | Description | Product document
February 29, 2020 | New best practice | Experience optimization | A best practice is added to describe how to use the data synchronization feature of DataWorks to migrate data from Oracle to MaxCompute. | Best practice to migrate data from Oracle to MaxCompute
February 28, 2020 | New feature | New feature | A topic is added to describe the Projects page in App Studio. You can create and manage projects on the Projects page. | Projects
February 28, 2020 | New feature | New feature | A topic is added to describe the Apps page in App Studio. The Apps page displays applications that you created, applications that you shared, and third-party applications. | Apps
February 28, 2020 | New feature | New feature | A topic is added to describe the Templates page in App Studio. The Templates page displays all templates that are created based on projects. | Templates
February 28, 2020 | New feature | New feature | A topic is added to describe how to create an application in App Studio and deploy it in the production environment so that the application can be accessed over the Internet. | App deployment
February 02, 2020 | New feature | New feature | Topics are added to describe how to use the DataAnalysis service. DataAnalysis allows you to collaboratively edit and analyze workbooks, manage MaxCompute tables in tabular mode, and generate and share visual reports. | DataAnalysis

Changes in December 2019