This topic describes the change history of the DataWorks documentation. You can use it to learn about new features and feature changes in DataWorks.
DataWorks can be automatically updated. The update has no impact on existing users.
Changes in August 2023
Date | Item | Category | Description | References |
2023.8.29 | New feature | DataService Studio | A topic is updated to describe the instance types that are newly supported for exclusive resource groups for DataService Studio. The instance types are api.s2.small, api.s2.medium, and api.s2.large. | Billing of exclusive resource groups for DataService Studio (subscription) |
2023.8.29 | New feature | Operation Center | Topics are updated to describe the following feature that is newly supported in Operation Center: You can adjust the priority of the YARN queue of a node by configuring a priority mapping between the baseline to which the node belongs and a YARN queue. | |
2023.8.28 | New feature | SettingCenter | A topic is updated to describe a built-in workspace-level role named Role_Project_Scheduler. This role is newly added and can be used to schedule and run MaxCompute tasks in the production environment. | |
2023.8.25 | New feature | Data Modeling | A topic is added to describe the relationship diagram feature that is newly provided. This feature allows you to quickly build an architecture for models in your data warehouse, which can intuitively display relationships between the models in your data warehouse. Each relationship diagram displays relationships between models in one data warehouse. You can create multiple relationship diagrams within one Alibaba Cloud account. | |
2023.8.25 | New feature | Data Integration | A topic is added to describe Amazon Redshift data sources that are newly supported. You can use Amazon Redshift Reader and Amazon Redshift Writer to read data from and write data to Amazon Redshift data sources and can configure a synchronization task for an Amazon Redshift data source by using the codeless user interface (UI) or code editor. | |
2023.8.24 | New feature | Operation Center | A topic is added to describe the scheduling calendar feature that is newly provided. This feature allows you to define scheduling dates and scheduling methods for tasks in a more flexible manner. | |
2023.08.16 | Updated feature | SettingCenter | A topic is added to describe how to create a MaxCompute data source of the new version. To provide a better user experience, the DataWorks development team released a new version of MaxCompute data sources, into which operations related to MaxCompute compute engines are integrated. For example, you can create or edit a MaxCompute compute engine on the pages related to MaxCompute data sources in the DataWorks console. In addition, the permissions on MaxCompute data sources have changed. | |
2023.08.15 | New feature | Operation Center | A topic is updated to describe the trigger conditions that are newly supported if you set the Object Type parameter to Workspace for a custom alert rule. The trigger conditions include the number of instances with errors, the proportion of instances with errors, and task logs that contain specific keywords. | |
2023.08.04 | New feature | Data Integration | A topic is added to describe how to create and configure a synchronization task in Data Integration to synchronize data from Kafka to a data lake, such as Object Storage Service (OSS), in real time. | Synchronize data from a Kafka table to OSS (Hudi) in real time |
Changes in July 2023
Date | Item | Category | Description | References |
2023.7.31 | Optimized setting | DataService Studio | A topic is updated to optimize the architecture and content of the topics for DataService Studio. | |
2023.7.31 | Updated feature | Data Governance Center | Topics are updated to describe how to handle governance issues and check events of MaxCompute and E-MapReduce (EMR) data sources. | |
2023.7.25 | Updated feature | DataWorks console | A topic is updated to describe the updates of the features on various pages in the DataWorks console. | |
2023.7.18 | New feature | Data Integration | A topic is added to describe how to create and configure a real-time extract, transform, and load (ETL) synchronization task to synchronize data from Simple Log Service to Hologres. | Create a real-time ETL synchronization task to synchronize data from Simple Log Service to Hologres |
2023.7.16 | New feature | Data Modeling | A topic is added to describe composite metrics. Composite metrics are calculated based on specific derived metrics and calculation rules and are fine-grained metrics that can help you collect statistics about your business in a flexible manner. | |
2023.7.13 | New feature | Data Integration | A topic is added to describe how to create and configure a real-time ETL synchronization task to synchronize data from Kafka to Hologres. A real-time ETL synchronization task initializes the schema of a destination Hologres table based on the structure of a specified Kafka topic, and then synchronizes full data from the topic to the table at a time and synchronizes incremental data from the topic to the table in real time. | Create a real-time ETL synchronization task to synchronize data from Kafka to Hologres |
2023.07.08 | New feature | Management and control | A topic is added to describe the built-in logic of a default workspace. A default workspace is generated the first time you activate DataWorks or if you activate DataWorks in a new region. | |
2023.07.06 | | | | |
Changes in June 2023
Date | Item | Category | Description | References |
2023.6.30 | New feature | DataStudio | A topic is updated to describe how to configure a code template for node types such as PyODPS 3 and EMR Spark SQL. | |
2023.6.29 | New feature | DataStudio | A topic is added to describe how to create and use a Function Compute node. You can use Function Compute nodes to periodically schedule event processing functions and complete integration and joint scheduling with other types of nodes. | |
2023.6.29 | New feature | SettingCenter | A topic is updated to describe the following operations that are supported when you associate an EMR compute engine with a workspace: | |
2023.6.27 | Updated feature | Operation Center | A topic is updated to describe the modification to the Overview page in Operation Center. After the modification, the page displays the overall O&M information, including the results of O&M stability assessment, key O&M metrics, usage of scheduling resources, and status information of auto triggered tasks. The page also displays information about synchronization tasks in Data Integration. This helps you quickly understand the overall information of tasks in your workspace, identify and handle exceptions at the earliest opportunity, and improve O&M efficiency. | |
2023.6.25 | New feature | Data Modeling | A topic is updated to describe the following feature that is supported when you perform data modeling by using the code editor on the System Management page in Data Warehouse Planning: Specify whether the Comment field in DDL statements of compute engines in code corresponds to a display name or a description that is specified by a parameter in the codeless UI based on your business requirements. | |
2023.6.16 | New feature | DataStudio | | |
2023.6.10 | Updated feature | DataStudio | The structure and content of the "Develop a MaxCompute Spark task" topic are optimized. |
Changes in May 2023
Date | Item | Category | Description | References |
2023.5.22 | New feature | SettingCenter | A topic is updated to describe the service-linked roles for the Alibaba Cloud services to which compute engines belong. If you want to perform compute engine-related operations in the DataWorks console, such as associating a compute engine with a workspace or modifying an existing compute engine instance, the system prompts you to perform authorization operations for DataWorks. After the authorization is complete, the system creates a service-linked role for the Alibaba Cloud service to which the related compute engine belongs. | |
2023.5.10 | Updated feature | Open Platform | A topic is updated to describe the optimization and updates to the graphical user interface (GUI) of the Open Platform service. |
Changes in April 2023
Date | Item | Category | Description | References |
2023.4.19 | New feature | Data Integration | A topic is added to describe how to create and configure a batch synchronization task to synchronize all data in an EMR Hive database to MaxCompute at a time. | Synchronize data in an EMR Hive database to MaxCompute in offline mode |
2023.4.17 | Optimized setting | Management and control | A topic is added to describe how to change the time zone for scheduling. Before you create a DataWorks workspace, you must select the region in which you want to create the workspace. By default, the time zone of the region in which the DataWorks workspace resides is the time zone for scheduling. You can change the time zone for scheduling based on your business requirements. | |
2023.4.14 | New feature | Data Integration | A topic is added to describe how to create and configure a batch synchronization task to synchronize all data in a MySQL database to Hive at a time. | Synchronize full data from a MySQL database to Hive at a time |
2023.4.12 | Updated feature | Data Integration | Topics are updated to describe the data read modes and data write modes that are newly supported for reading data from and writing data to wide tables and time series tables in Tablestore data sources and OTSStream data sources. The new modes are the row mode and column mode. |
Changes in March 2023
Date | Item | Category | Description | References |
2023.3.28 | Updated feature | Data Map | A description for creating crawlers and collecting metadata from various data sources to DataWorks by using the created crawlers is provided. | |
2023.3.23 | New feature | Data Integration | Topics are updated to describe the LogView feature that is newly provided. You can use the feature to view running information about the batch and real-time synchronization tasks. | |
2023.3.21 | Updated feature | Data Modeling | Topics are updated to describe the optimization and updates that are related to data layers. The features of data layer checkers are optimized. The rules defined in all checkers that are created for tables or derived metrics at the same data layer have the same strength type. The strength type is strong or weak. | |
2023.3.02 | New feature | Data Integration | A topic is added to describe how to create and configure a batch synchronization task to synchronize all data in an ApsaraDB for ClickHouse database to Hologres at a time. | |
2023.3.02 | New feature | DataStudio | An overview for scheduling properties is provided. If you want the system to periodically schedule a task, you must define scheduling properties such as the scheduling cycle, dependencies, and scheduling parameters for the task. |
Changes in February 2023
Date | Item | Category | Description | References |
2023.2.28 | New feature | Data Governance Center | Custom configurations of notifications for governance issues that are displayed on the Governance issues page in the DataWorks console are supported. Notifications can be sent to specified personnel by system message, email, or DingTalk group message. This way, the governance issues can be viewed and handled at the earliest opportunity. | |
2023.2.26 | Updated feature | DataStudio | A topic is updated to describe the optimizations to the procedures of undeploying auto triggered tasks and restoring undeployed tasks, and the processing solutions for instances that are generated but are not run and instances that are running after tasks are undeployed. | |
2023.2.21 | New feature | DataStudio | A topic is added to describe a general development process for data development. Different types of compute engine tasks can be encapsulated into different types of nodes to define data development tasks. Resources, functions, and related logic processing nodes can be used to develop more complex tasks. You can refer to the general development process for tasks in DataStudio to perform data development. | |
2023.2.17 | Optimized setting | Data Integration | The architecture and content of topics in the Data Integration documentation are adjusted. | |
2023.2.16 | Updated feature | DataStudio | The description for configuring and using OSS object inspection nodes is optimized. | |
2023.2.14 | New feature | Migration Assistant | A topic is added to describe how to export the tasks in DolphinScheduler and then import the tasks to DataWorks. | |
2023.2.09 | Updated feature | DataStudio | The architecture of topics in the documentation for script templates is adjusted, and the logic for using script templates is optimized. |
Changes in January 2023
Date | Item | Category | Description | References |
2023.1.17 | New feature | DataStudio | An introduction to the procedure of node debugging is provided. You can use features such as run, run with parameters, and quick run to debug complete code or code snippets based on your business requirements. After the debugging is complete, you can view the running results. | |
2023.1.17 | Updated feature | DataStudio | The details of node groups are optimized. A description for deleting a node group is added. | |
2023.1.11 | New feature | Operation Center | A topic is added to describe the intelligent diagnosis feature that is newly provided. You can use this feature to quickly determine the reasons why nodes fail to run. Multiple factors may affect the running of a node. | |
2023.1.10 | New feature | DataStudio | A topic is updated to describe how to search for and view operation records in a workspace on the DataStudio page by operation type, operator, or operation time. | |
2023.1.9 | New feature | Data Modeling | A topic is added to describe the system management feature that is newly provided. You can use this feature to manage table creation policies in a data warehouse. For example, you can configure a table creation policy that prohibits users who do not have data models from creating physical MaxCompute tables in DataStudio in the production environment. After you enable the table creation policy, when a user creates or modifies a physical MaxCompute table in DataStudio in the production environment, the system checks the name of the table based on the policy. This ensures standardization of table creation. | |
2023.1.6 | New feature | Data Modeling | A topic is updated to describe how to publish and materialize a table. Tables can be published and materialized to EMR and Hologres compute engine instances. |
Changes in December 2022
Date | Item | Category | Description | References |
2022.12.29 | Updated feature | DataStudio | A topic is updated to describe the optimizations to the operations that are related to the creation and use of MaxCompute tables in the following aspects: visualized creation of MaxCompute tables, committing and deployment of MaxCompute tables, data write to and data export from MaxCompute tables, and query of data in MaxCompute tables. | |
2022.12.23 | Optimized setting | DataStudio | A topic is updated to describe the optimizations to the settings that are related to table management, such as configuring table-related formats, creating or managing folders, and creating or managing layers. | |
2022.12.23 | New feature | Compute engine association | A topic is updated to describe the change to the entry points for associating compute engines with a workspace, and a description is provided for the permissions that are required to associate a compute engine with a workspace. | |
2022.12.15 | New feature | DataStudio | Topics are added to describe the processes of developing Hologres nodes and ODPS nodes in DataWorks. | |
2022.12.6 | New feature | Open Platform | A status change event for a workflow is added. |
Changes in November 2022
Date | Item | Category | Description | References |
2022.11.24 | New feature | DataStudio | Guidance for configuring scheduling dependencies and principles of scheduling configurations in complex dependency scenarios are provided to help you understand the procedure and key points of configuring scheduling dependencies. Before you configure scheduling dependencies, make sure that you are familiar with the guidance and principles. This helps prevent data exceptions caused by inappropriate scheduling dependency configurations. | |
2022.11.23 | New feature | DataStudio | Topics are updated to describe how to create Hologres internal tables and Hologres foreign tables in the DataWorks console. | |
2022.11.18 | Updated feature | Open Platform | A topic is updated to describe the change to the entry point of the Open Platform page. | |
2022.11.17 | New feature | Data Map | A topic is updated to describe how to view the details of and manage a table. Tables can be added to a data album for management, and the data albums to which tables are added can be viewed in Data Map. | |
2022.11.3 | New feature | Security Center | A topic is added to describe how to use the data query and analysis control feature that is newly provided. This feature allows you to authorize a role or member to query a specific data source in a DataWorks module. This feature also allows you to manage the permissions on query results. |
Changes in October 2022
Date | Item | Category | Description | References |
2022.10.21 | Updated feature | Management and control | | Overview of the DataWorks console and Overview of the features in SettingCenter |
2022.10.20 | New feature | Resource group | The service-linked role AliyunServiceRoleForDataWorks is automatically created by DataWorks the first time you use an exclusive resource group. You can use the role to access resources in a virtual private cloud (VPC), an elastic network interface (ENI), and a security group. The service-linked role can also be created by using a RAM user. |
Changes in September 2022
Date | Item | Category | Description | References |
2022.9.23 | Updated feature | DataWorks console | A topic is updated to describe the optimization to the O&M Assistant feature. This feature allows you to create, run, and delete commands on an exclusive resource group for scheduling. This feature also allows you to view the execution results of the commands. | |
2022.9.22 | New feature | DataStudio | A topic is updated to describe the forcible code review feature and how to enable and use this feature for a workspace in basic mode. | |
2022.9.20 | New feature | Operation Center | Topics are updated to describe how to view the custom alert rule and baseline that are associated with an auto triggered node or an auto triggered instance on the General tab of the node or instance. If no custom alert rule or baseline is associated with the node or instance, you can quickly create a custom alert rule or a baseline on this tab. | View auto triggered node instances, Test an auto triggered node and view test instances generated for the node, and Appendix: Use the features provided in a DAG |
2022.9.19 | Updated feature | Data Integration | Topics are updated to describe how to create and configure a synchronization task that uses DM Reader or DM Writer by using the codeless UI. | |
2022.9.06 | New feature | Data Modeling | A topic is added to describe how to create a dimension. A dimension can be planned and created in the Dimensional Modeling module. You can associate the dimension with a dimension table when you create the dimension table. After the association, you can view your business data based on the dimension. | |
2022.9.06 | New feature | Data Modeling | A topic is added to describe the import feature that is newly supported. This feature provides different types of import templates for objects such as models and data metrics. You can use this feature to import information of multiple objects at the same time based on an object import template to create the objects in Dimensional Modeling. |
Changes in August 2022
Date | Item | Category | Description | References |
2022.8.30 | Updated feature | Data Integration | | |
2022.8.22 | New feature | Operation Center | The Workflow Perspective tab is added on the Cycle Instance page. On this tab, you can view the status of a workflow in the Workflow column based on the icons that represent the status of auto triggered node instances in the workflow. You can perform different operations on a workflow by clicking entry points in the Actions column. The operations that you can perform on a single auto triggered node instance on the Workflow Perspective tab are the same as the operations that you can perform on the auto triggered node instance on the Instance Perspective tab. | |
2022.8.18 | New feature | Data Modeling | The following features are added to Data Modeling of DataWorks: | |
2022.8.05 | New feature | DataStudio | Topics are added to describe how to synchronize schemas and data of MaxCompute tables to Hologres. | |
2022.8.02 | New feature | DataStudio | |
Changes in July 2022
Date | Item | Category | Description | References |
2022.7.29 | New feature | Data Modeling | | |
2022.7.29 | New feature | Data Modeling | The model development feature of DataWorks Data Modeling is supported. You can associate a table with an existing node in DataStudio. After the association, you can double-click the name of the node to go to the configuration tab of the node to develop data. | |
2022.7.29 | New feature | Data Modeling | A topic is added to describe how to configure and use a checker that checks the names of derived metrics at a data layer. A checker at a data layer can define a naming convention for derived metrics at the data layer to help reduce O&M costs. | |
2022.7.8 | New feature | DataStudio | An EMR DataLake cluster can be associated with a DataWorks workspace as a compute engine instance. This way, you can develop and run EMR nodes based on the compute engine instance. Topics are added to describe the development process of an EMR node in DataWorks, the configurations of an EMR DataLake cluster in DataWorks, and permission management when a user runs EMR nodes in DataWorks. | |
2022.7.2 | Updated feature | DataStudio | Topics are updated to describe the following new scenarios in which zero load nodes can be used: |
Changes in June 2022
Date | Item | Category | Description | References |
2022.6.28 | New feature | Data Modeling | A topic is updated to describe how to perform reverse modeling on physical tables. The fuzzy match rule can be specified in a reverse modeling policy to match physical table names. | |
2022.6.27 | New feature | Data Security Guard | A topic is updated to describe how to identify sensitive data. You can set the Scanning range parameter to Custom range when you create a sensitive data identification task on the Sensitive data identification page in Data Security Guard in the DataWorks console. In addition, you can view the progress of and logs for sensitive data identification. | |
2022.6.22 | Updated feature | Open Platform | The service that is used to subscribe to and consume messages in OpenEvent is changed from Kafka to EventBridge. | |
2022.6.16 | Updated feature | DataStudio | The scenario in which scheduling dependencies must be configured for nodes across workflows or workspaces is added. | Scenario 3: Configure dependencies for nodes across workflows or workspaces |
2022.6.13 | New feature | DataStudio | Features in DataStudio can be displayed based on the permissions of a user, and custom display of features on the DataStudio page based on business requirements is supported. This can help you easily get started with DataStudio. | |
2022.6.2 | New feature | Data Integration | Query of the data that is synchronized to MaxCompute after the related synchronization task finishes running is supported. | |
2022.6.2 | New feature | Data Integration | A topic is added to describe how to add a StarRocks data source to DataWorks. You can configure synchronization tasks that use StarRocks Reader or StarRocks Writer to read data from or write data to StarRocks data sources in the codeless UI or code editor. |
Changes in May 2022
Date | Item | Category | Description | References |
2022.5.23 | New feature | Approval Center | Topics are updated to describe how to create a request processing policy that is used when a Data Integration node is saved. Such a request processing policy can be created by a user that is assigned the Workspace Administrator role and takes effect for the workspace in which the policy is created. | |
2022.5.22 | Updated feature | Data Security Guard | | |
2022.5.18 | New feature | Data Security Guard | A topic is added to describe how to use the data lineage feature of Data Security Guard to visualize the lineage of sensitive data, analyze abnormal associations between fields, and identify fields whose identification results are abnormal. The data lineage feature provides information about the spread and impacts of sensitive data and helps efficiently identify sensitive data. | |
2022.5.18 | New feature | Data Modeling | A topic is added to describe the Homepage of Data Modeling. On the Homepage, you can view the number of tables and derived metrics in the current workspace of your account. You can also view the tables that are successfully published to the production environment within the last 30 days. This way, you can obtain an overview of the tables. | |
2022.5.13 | New feature | API | A topic is added to describe how to query migration tasks. | |
2022.5.11 | New feature | Data Integration | A topic is added to describe how to use HBase20xsql Reader to read data from Phoenix tables that are mapped to HBase SQL tables. | |
2022.5.12 | Updated feature | Commercial use | The architecture of the "Billing overview" topic is adjusted. | |
2022.5.10 | New feature | Intelligent monitoring | |
Changes in April 2022
Date | Item | Category | Description | References |
2022.04.29 | Updated feature | Billing rule and resource group | | |
2022.04.17 | Updated feature | Edition and resource group | A topic is added to describe how to change the specifications of a resource group. The added topic also describes how to prepare for the change of specifications. In the Change preparation step, you need to confirm the possible impact of the operation and determine whether to allow the system to automatically rerun the terminated node after the change is complete. This improves user experience. | |
2022.04.15 | Updated feature | Intelligent baseline | | |
2022.04.15 | New feature | Data Analyst role | By default, users with the Data Analyst role have permissions only on DataAnalysis. | |
2022.04.14 | New feature | Basic operations in the DataWorks console | By default, after you select a region, the time zone for the region that you select is automatically used as the time zone for scheduling. This indicates that the time zone is used when you configure the scheduling time for a node. When you create a workspace in the US (Silicon Valley) or Germany (Frankfurt) region for the first time, a message appears. In the message, you can submit a ticket to contact technical support to set the time zone for scheduling to the UTC+8 time zone. | |
2022.04.13 | New feature | Data Security Guard | |
Changes in March 2022
Date | Item | Category | Description | References |
2022.03.28 | New feature | DataStudio | A topic is added to describe the quick run feature of DataWorks. This feature allows you to quickly run the code snippet that you select on the configuration tab of a node. You can use this feature to test whether a code snippet is correctly written. The added topic describes how to quickly run a code snippet of a node. | |
2022.03.25 | Updated feature | DataStudio | A topic is updated to describe the new features on the DataStudio page. This helps you understand the overall layout of the DataStudio page and the features on this page and view relevant topics with ease. The following features are added: | |
2022.03.21 | Updated feature | Data governance | A topic is updated to describe how to filter governance items and check events by role from the personal perspective. | |
2022.03.20 | Updated feature | Updates | | |
2022.03.17 | Updated feature | Data Map | | |
2022.03.17 | Updated feature | Scheduling parameter | A topic is updated to describe the adjusted overall structure and logic of topics related to scheduling parameters. In DataWorks, nodes are scheduled to run based on scheduling parameters. Scheduling parameters are automatically replaced with specific values based on the data timestamps of the nodes, the time when the nodes are scheduled to run, and the value formats of the scheduling parameters. This enables dynamic parameter settings for node scheduling. This way, you can quickly view information about scheduling parameters and use them. | |
2022.03.16 | Updated feature | DataService Studio | A topic is updated to describe how to use a function as a filter for an API. If you need to use a filter to preprocess the request parameters of an API or perform secondary processing on query results of the API, perform the following operations to configure a filter: In the right-side navigation pane of the configuration tab of the API, click Filter. On the Filter tab, select Use Pre-filter or Use Post-filter based on your business requirements. | |
2022.03.07 | Updated feature | Data Security Guard | |
Changes in February 2022
Date | Item | Category | Description | References |
2022.02.08 | Updated feature | Data Integration | Topics are updated to describe how to configure batch synchronization tasks that use different Writers in the codeless UI. | |
2022.02.15 | Updated feature | DataStudio | Topics are updated to describe how to configure related settings on the DataStudio page. | |
2022.02.20 | New feature | Scheduling dependency | A topic is added to describe how to fix the following issue: After you enable automatic parsing for a node, the scheduling dependencies of the node are different from those that are identified by DataWorks when you commit the node. | |
2022.02.25 | Updated feature | DataStudio | A topic is updated to describe how to create a merge node and define the merging logic for the node. |
Changes in January 2022
Date | Item | Category | Description | References |
2022.01.20 | New feature | Data Modeling | A topic is added to describe how to create an application table in Data Modeling. Different application tables are suitable for different business scenarios. An application table is used to organize statistical data collected by atomic and derived metrics of the same statistical period, dimension, and statistic granularity. This allows you to perform subsequent business queries, online analytical processing (OLAP) analysis, and data distribution in an efficient manner. | |
2022.01.18 | New feature | Data Modeling | A topic is added to describe how to create and manage dimensions in Data Modeling. The dimension management feature allows you to create and manage dimensions in a centralized manner to ensure that each dimension is unique. | |
2022.01.18 | New feature | Data Modeling | Topics are added to describe how to create data marts and manage subject areas in Data Modeling. | |
2022.01.16 | New feature | DataStudio | A topic is added to describe how to configure same-cycle scheduling dependencies between nodes in DataStudio. After you configure scheduling dependencies for a node, you can preview the scheduling dependencies of the node from the node dependency and instance dependency dimensions. This allows you to modify the scheduling dependencies that do not meet your business requirements at the earliest opportunity. | |
2022.01.15 | Updated feature | DataStudio | A topic is updated to describe how to configure a resource group for scheduling for auto triggered nodes in DataStudio. The running of auto triggered nodes depends on resource groups for scheduling. You can select a resource group in the Resource Group section of the Properties tab for the node. | |
2022.01.14 | New feature | DataStudio | A topic is added to describe how to enable periodic scheduling and configure the scheduling settings for auto triggered nodes in DataStudio. To run auto triggered nodes as scheduled, you must go to the Scheduling Settings tab in DataStudio to enable periodic scheduling. | |
2022.01.14 | New feature | DataStudio | A topic is updated to describe how to configure the rerun property for a node in DataStudio. DataStudio allows you to configure rerun-related parameters in the Schedule section of the Properties tab of a node. | |
2022.01.14 | New feature | DataStudio | A topic is updated to describe the types of scheduling parameters and the precautions for using scheduling parameters in DataStudio. You can assign built-in parameters to scheduling parameters as values in the Scheduling Parameter section of the Properties tab of a node based on your business requirements. | |
2022.01.12 | New feature | DataAnalysis | A topic is added to describe how to write Markdown text and SQL code, run the code for queries, and then save the query results by using the SQLNotes feature of DataWorks. | |
2022.01.06 | Updated feature | DataStudio | A topic is added to describe the features on the DataStudio page. This helps you understand the overall layout of and features on the DataStudio page and view relevant topics with ease. |
Changes in December 2021
Date | Item | Category | Description | References |
2021.12.27 | New feature | Data Map | A topic is added to describe how to create and manage a Cloudera Distribution Hadoop (CDH) Hive sampling crawler in Data Map. Data Map allows you to use the sampling crawler to sample a CDH Hive table. This way, Data Security Guard can detect sensitive data. If you configure data masking rules in Data Security Guard, data of the sensitive fields that match the rules is masked when you preview data on the details page of a table in Data Map. | |
2021.12.24 | New feature | API | | GetDISyncTask, DeployDISyncTask, GetDISyncInstanceInfo, and TerminateDISyncInstance |
2021.12.20 | New feature | DataService Studio | Topics are added to describe how to create an Aviator function, use an Aviator function as the pre-filter or post-filter for an API, and edit code for the Aviator function based on the Aviator syntax. | Create an Aviator function and Best practices of using Aviator functions as filters |
2021.12.14 | New feature | Data Quality | A topic is added to describe how to configure monitoring rules based on a monitoring rule template. Data Quality provides various built-in table-level and field-level monitoring rule templates based on which you can configure monitoring rules. | |
2021.12.09 | Updated feature | Usage analysis | A topic is added to describe how to view the data governance status in Data Governance Center. Data Governance Center allows you to view the data governance status from the following three perspectives: data production, data usage, and data management. You can select a perspective based on your business requirements to facilitate data governance. The data pivoting feature allows data developers and administrators to view and analyze the information about tables, running status of tasks, and resource usage in one or all workspaces. This helps data developers and administrators allocate resources. | |
2021.12.02 | New feature | API | | API operations related to extension point events, and API operations that can be used to obtain and generate asynchronous information required by synchronization nodes |
Changes in November 2021
Date | Item | Category | Description | References |
2021.11.24 | Updated feature | Data Integration | Topics are added to describe how to configure batch synchronization tasks that use HDFS Reader or HDFS Writer in the codeless UI. | |
2021.11.20 | New feature | API | A topic is added to describe how to call the ListDags operation to obtain the details of DAGs for a single data backfill instance based on OpSeq. OpSeq is the unique identifier for data backfill. | |
2021.11.14 | New feature | DataStudio | A topic is added to describe how to perform operations on multiple DataWorks objects at the same time. DataWorks allows you to modify configurations such as the owner of multiple nodes, resources, or functions at the same time. After the modification, you can commit and deploy the nodes, resources, or functions to the production environment for the modifications to take effect. | |
2021.11.08 | New feature | DataStudio | A topic is added to describe the resource group orchestration feature. This feature allows you to change resource groups for the scheduling of multiple nodes in a workflow at the same time. If multiple resource groups for scheduling exist in your workspace, you can change the resource groups for scheduling of nodes in the workspace based on your business requirements. This can facilitate reasonable resource usage. |
Changes in October 2021
Date | Item | Category | Description | References |
2021.10.26 | New feature | Data Modeling | | |
2021.10.22 | Updated feature | Data Security Guard | | |
2021.10.15 | New feature | API | | |
2021.10.14 | New feature | API | A topic is added to describe how to call API operations to create, configure, and manage a synchronization task in Data Integration. | Use API operations to create, configure, and manage a batch synchronization task |
2021.10.11 | New feature | DataStudio | A topic is added to describe the code search feature. The code search feature allows you to query code snippets in the code of nodes by keyword. The search results show the details of each code snippet and the nodes whose code contains the code snippets. You can use this feature to trace the node that causes changes in a table. |
Changes in September 2021
Date | Item | Category | Description | References |
2021.09.30 | New feature | Scheduling settings in DataStudio | A topic is added to describe the configurations of scheduling parameters. Scheduling parameters are used during the running of DataWorks nodes. The values of scheduling parameters are automatically replaced with specific values based on the data timestamps of the nodes and the value formats of the scheduling parameters. This enables dynamic parameter settings during the running of nodes. | |
2021.09.30 | New feature | Scheduling settings in DataStudio | A topic is added to describe how to configure cross-cycle dependencies between nodes and the types of cross-cycle dependencies supported by DataWorks. If you configure cross-cycle dependencies for a node, the instance of the current node in the current cycle can be run only if the instance of the node on which the current node depends in the previous cycle is successfully run. | |
2021.09.26 | New feature | Data Map | Topics are added to describe the new features of the Data Map service. Data Map allows you to query APIs in all workspaces that are owned by the current tenant and view details of the APIs. This enables quick queries. On the details page of an API, you can view the basic information, parameters, and sample responses of the API. | |
2021.09.15 | New feature | DataAnalysis | A topic is added to describe how to use SQL statements to query and analyze data of the added data sources in DataAnalysis. | |
2021.09.02 | New feature | Operation Center | A topic is added to describe the advanced mode to generate data backfill instances for auto triggered nodes. The advanced mode is used to generate data backfill instances for multiple nodes at the same time. You can select nodes that may not have dependencies with each other. You can select nodes for which you want to backfill data in the DAG of an auto triggered node or in the node list on the Cycle Task page. | Backfill data for an auto triggered node and view data backfill instances generated for the node |
Changes in August 2021
Date | Item | Category | Description | References |
2021.08.29 | New feature | Data Integration | A topic is added to describe how to use the data masking feature. This feature can mask sensitive data in a single table that is synchronized in real time and store the data in a specific database. | |
2021.08.22 | New feature | Data Integration | A topic is added to describe how to synchronize data to Kafka in real time by using the Data Integration service of DataWorks. | |
2021.08.11 | New feature | SSL-based authentication | Topics are updated to describe the SSL-based authentication for data sources. If you add data sources such as MySQL, SQL Server, and PostgreSQL data sources, you can enable SSL-based authentication for the data sources. After the configuration is complete, only trusted applications and services can access the data resources. In DataWorks, SSL-based authentication is provided as a third-party identity authentication mechanism. Third-party identity authentication mechanisms are used to perform strict identity authentication on users and services. These mechanisms prevent untrusted applications or services from accessing data and improve the stability of data access during data synchronization. | |
2021.08.07 | Updated feature | Permission management system | A topic is added to describe the permission management system of DataWorks. The permission management system of DataWorks consists of two parts: permissions controlled by using RAM and permissions controlled by DataWorks. | |
2021.08.01 | New feature | Migration Assistant | A topic is updated to describe the Migration Assistant service of DataWorks. Migration Assistant was officially commercialized on August 1, 2021. Migration Assistant allows you to migrate data objects across different DataWorks editions, Alibaba Cloud accounts, regions, and workspaces. You can export the data objects in your workspace, including auto triggered nodes, manually triggered nodes, resources, functions, data sources, table metadata, ad hoc queries, and script templates. You can also create full export tasks, incremental export tasks, or custom export tasks to export your data objects in DataWorks based on your business requirements. |
Changes in July 2021
Date | Item | Category | Description | References |
2021.07.22 | New feature | API operation | A topic is added to describe how to call the CreateDISyncTask operation to create a batch synchronization task. | |
2021.07.14 | New feature | Configurations in the DataWorks console | A topic is added to describe how to add a RAM user or a RAM role as an alert contact on the Alert Contacts page in the DataWorks console. If an error occurs during the running of a node, DataWorks sends alert notifications to the specified alert contact. This allows you to handle exceptions at the earliest opportunity. | |
2021.07.09 | New feature | Billing | A topic is updated to describe the billing method for different editions that are used in the China East 2 Finance and China South 1 Finance regions. | |
2021.07.03 | New feature | Data Security Guard | A topic is added to describe how to use the data traceability feature provided by DataWorks. This feature allows you to extract the watermark information of the data in a leaked data file. This helps you trace users who caused data leaks. | |
2021.07.02 | New feature | Data Security Guard | A topic is added to describe how to create and manage sample libraries based on the sample files that you provide. You can associate a sample library with a data identification rule to identify data. If the data to be identified contains the data in the sample library, the data to be identified matches the data identification rule. You can use sample libraries to identify enumerated values, such as employee names and user addresses. | |
2021.07.02 | New feature | Data Security Guard | A topic is added to describe how to use sample fields to train models. DataWorks extracts the characteristics of these fields and generates a rule model. You can use this rule model to identify the data that has similar characteristics in your data assets. |
Changes in June 2021
Date | Item | Category | Description | References |
2021.06.11 | New feature | DataStudio | A topic is added to describe how to create and run an EMR Streaming SQL node. EMR Streaming SQL nodes allow you to use SQL statements to develop streaming analytics jobs. | None |
2021.06.11 | New feature | DataStudio | A topic is added to describe how to create and run an EMR Spark Streaming node. EMR Spark Streaming nodes can be used to process streaming data with high throughput. This type of node supports fault tolerance, which helps you restore data streams on which errors occur. | None |
2021.06.09 | New feature | Operation Center | A topic is added to describe how to view the basic information and status of real-time computing nodes on the Stream Task page in Operation Center in the DataWorks console. This allows you to monitor the status of the nodes. In addition, you can configure alert rules for the nodes that you want to monitor. This way, you can identify and handle exceptions at the earliest opportunity. |
Changes in May 2021
Date | Item | Category | Description | References |
2021.05.20 | New feature | Operation Center | A topic is added to describe how to create and manage a shift schedule in DataWorks. If you set the Recipient parameter to Varies According to Shift Schedule and select a shift schedule when you create a custom alert rule, DataWorks can send alert notifications to the on-duty engineers that you specify for the shift schedule. After the engineers receive the alert notifications, they can identify and handle exceptions at the earliest opportunity. | |
2021.05.17 | New feature | DataStudio | A topic is added to describe how to create and run ClickHouse SQL nodes. A ClickHouse SQL node allows you to use a distributed SQL query engine to process structured data. This improves execution efficiency of tasks. | |
2021.05.15 | New feature | Data Integration | Topics are added to describe how to synchronize data to AnalyticDB for MySQL 3.0 by using the Data Integration service of DataWorks. |
Changes in April 2021
Date | Item | Category | Description | References |
2021.04.29 | New tutorial | Getting started | A topic is added to provide AI tutorials that teach you how to develop nodes in DataWorks. | AI tutorials |
2021.04.28 | New feature | Data Integration | A topic is added to describe how to add source tables to or remove source tables from a synchronization task used to synchronize data to Hologres after the synchronization task is run. | Add or remove source tables to or from a synchronization task that is running |
2021.04.22 | New feature | DataStudio | A topic is added to describe how to create an FTP Check node that can be used to periodically detect whether a specific file exists based on FTP. If the FTP Check node detects that the file exists, the scheduling system starts to run the descendant node of the FTP Check node. Otherwise, the FTP Check node retries the detection at the configured detection interval and keeps retrying until the condition to stop the detection is met. In most cases, FTP Check nodes are used for communications between the DataWorks scheduling system and external scheduling systems. | |
2021.04.06 | API operation | API operation | A topic is added to describe how to call the GetPermissionApplyOrderDetail operation that is provided by Security Center. | |
2021.04.05 | New feature | Data Integration | A topic is added to describe how to synchronize data to Kafka in real time by using the Data Integration service of DataWorks. |
Changes in March 2021
Date | Item | Category | Description | References |
2021.3.19 | New feature | Custom role | A topic is added to describe how to create a custom role in a DataWorks workspace. | |
2021.3.11 | New engine | Open source engines from which tasks are exported or migrated | Topics are updated to describe how to import tasks that are exported from the open source scheduling engines into DataWorks or migrate tasks from these scheduling engines to DataWorks. | |
2021.3.11 | New feature | Engine O&M | A topic is added to describe how to use the engine O&M feature of DataWorks to view the details of each EMR node and identify and remove the nodes that fail to run. This way, failed nodes do not affect the performance of descendant nodes. | |
2021.3.9 | New feature | Aggregate analysis of auto triggered nodes in a DAG | Topics are added to describe the aggregate view and the downstream and upstream analysis features of a DAG. You can view the details of an auto triggered node from the DAG and perform operations based on your business requirements. | |
2021.3.3 | New feature | API operation | Topics are added to describe API operations that are added for the Operation Center, Data Security Guard, and Migration Assistant services of DataWorks. |
Changes in February 2021
Date | Item | Category | Description | References |
2021.2.24 | New feature | Viewing of the status information about synchronization tasks | A topic is added to describe how to view the distribution and execution details of synchronization tasks that are run and how to handle synchronization tasks on which exceptions occur. This improves the O&M efficiency of synchronization tasks. | |
2021.2.9 | New feature | New real-time synchronization task | A topic is added to describe how to create a real-time synchronization task to synchronize data from a specific table and view the status of the synchronization task after the synchronization task is created. | None |
2021.2.6 | New feature | New real-time synchronization task | Topics are added to describe how to create a real-time synchronization task to synchronize data from specific or all tables in a database to MaxCompute, Hologres, or DataHub and view the status of the synchronization task after the synchronization task is created. | None |
2021.2.5 | New feature | New feature | A topic is added to describe how to add an ApsaraDB for OceanBase data source. You can configure a synchronization task for an ApsaraDB for OceanBase data source. | Add an ApsaraDB for OceanBase data source |
Changes in January 2021
Date | Item | Category | Description | References |
2021.1.28 | New feature | New node types supported by DataStudio | Topics are added to describe how to create and run a MySQL node and an AnalyticDB for MySQL node. You can use SQL statements to develop data for a MySQL data source and an AnalyticDB for MySQL data source. | |
2021.1.20 | New feature | New synchronization task | Topics are added to describe how to create a batch synchronization task and a real-time synchronization task to synchronize data from specific or all tables in a database to Elasticsearch and view the status of the synchronization tasks after the synchronization tasks are created. | |
2021.1.19 | New feature | Whitelist configuration and category management in Data Map | A topic is added to describe how to configure IP address whitelists for metadata collection and grant category management permissions. After you configure IP address whitelists, you can collect metadata and manage categories in Data Map. | Configure IP address whitelists for metadata collection |
2021.1.13 | New feature | Integration with ActionTrail | A topic is added to describe how to query DataWorks behavior events in ActionTrail. You can use the queried event details to perform behavior analysis, security analysis, resource change tracking, and compliance auditing. | |
2021.1.13 | | | | |
2021.1.7 | New feature | New feature | A topic is added to describe how to synchronize data from a MySQL data source to Elasticsearch. You can learn how to prepare resource groups and data sources, create a synchronization task, and view the status of the synchronization task. | None |
Changes in December 2020
Date | Item | Category | Description | References |
2020.12.24 | New feature | New synchronization task | Topics are added to describe how to synchronize data from a PolarDB, Oracle, or MySQL data source to Hologres or MaxCompute. You can learn how to prepare resource groups and data sources, create a synchronization task, and view the status of the synchronization task. The topics also provide answers to frequently asked questions. | None |
2020.12.14 | New feature | New feature | A topic is added to describe how to create a crawler and collect metadata from a Tablestore data source to DataWorks. You can view collected metadata on the Data Map page. |
Changes in November 2020
Date | Item | Category | Description | References |
2020.11.18 | New feature | API operation | A topic is added to describe how to call the CreateManualDag operation to trigger the running of a manually triggered workflow. | CreateManualDag |
2020.11.18 | New feature | API operation | A topic is added to describe how to call the GetManualDagInstances operation to query information about instances in a manually triggered workflow. | GetManualDagInstances |
2020.11.18 | New feature | API operation | A topic is added to describe how to query the details of a DAG based on the ID of the DAG. | |
2020.11.18 | New feature | API operation | A topic is added to describe how to call the SearchNodesByOutput operation to query a node based on the output. | SearchNodesByOutput |
2020.11.10 | New FAQ | User experience optimization | A topic is added to provide answers to frequently asked questions about Operation Center. | |
2020.11.02 | New feature | New feature | A topic is updated to describe the code review configuration in DataWorks. If you enable forcible code review, you must commit each node for the specific reviewer to review the code of the node. You can deploy the node only after the reviewer approves the code. |
Changes in October 2020
Date | Item | Category | Description | References |
2020.10.30 | API operation overview | User experience optimization | A topic is added to describe the applicable scopes, billing rules, and call limits of DataWorks API operations and describe the DataWorks API operations by function. | |
2020.10.28 | New feature | New feature | A topic is added to describe how to create an EMR table. | |
2020.10.28 | New feature | New feature | A topic is updated to describe how to associate an EMR compute engine with a DataWorks workspace. In DataWorks, you can create nodes such as Hive, MapReduce, Presto, and Spark SQL nodes based on an EMR compute engine and configure EMR workflows. You can also schedule the nodes and manage metadata. This can facilitate data output. | Associate an EMR cluster with a DataWorks workspace as an EMR compute engine instance |
Changes in September 2020
Date | Item | Category | Description | References |
2020.09.03 | Updates of billing method | Pricing | A topic is updated to describe the pay-as-you-go services and resources of DataWorks. The pay-as-you-go billing method allows you to use all the basic features of DataWorks in a cost-efficient manner. | |
2020.09.03 | Updated feature | User experience optimization | A topic is updated to provide basic information about DataWorks and to describe the features and limits of DataWorks. | |
2020.09.02 | New tutorial | User experience optimization | A topic is added to describe how to use DataWorks together with Machine Learning Platform for AI (PAI) to automatically identify users who steal electricity. This ensures that users use electricity in a safe manner. |
Changes in August 2020
Date | Item | Category | Description | References |
2020.08.07 | New data source | New feature | A topic is added to describe how to add a Hive data source. You can use Hive Reader and Hive Writer to read data from and write data to a Hive data source and can configure synchronization tasks for Hive data sources by using the codeless UI and code editor. | |
2020.08.07 | New data source | New feature | A topic is added to describe how to add a GBase8a data source. You can use GBase8a Reader and GBase8a Writer to read data from and write data to GBase8a data sources and can configure synchronization tasks for GBase8a data sources by using the codeless UI and code editor. | |
2020.08.07 | New data source | New feature | A topic is added to describe how to configure a Hologres data source. You can use Hologres Reader and Hologres Writer to read data from and write data to Hologres data sources and can configure synchronization tasks for Hologres data sources by using the codeless UI and code editor. | |
2020.08.07 | New data source | New feature | A topic is added to describe how to configure an HBase data source. You can use HBase Reader and HBase Writer to read data from and write data to HBase data sources and can configure synchronization tasks for HBase data sources by using the code editor. | |
2020.08.07 | New data source | New feature | A topic is added to describe how to add an Elasticsearch data source. You can use Elasticsearch Reader and Elasticsearch Writer to read data from and write data to Elasticsearch data sources and can configure synchronization tasks for Elasticsearch data sources by using the code editor. | |
2020.08.07 | New FAQ | User experience optimization | A topic is added to describe how to troubleshoot issues related to network connectivity, parameters, and permissions when you add data sources in DataWorks. | |
2020.08.07 | New feature | New feature | A topic is added to describe how to create an EMR Presto node. EMR Presto nodes allow you to perform interactive analysis and queries on large-scale structured and unstructured data. | |
2020.08.05 | New release notes of features | User experience optimization | A topic is added to describe the release notes of key features of DataWorks. | Announcements and updates |
Changes in June 2020
Date | Item | Category | Description | References |
2020.06.30 | New FAQ | User experience optimization | A topic is added to provide answers to frequently asked questions about DataWorks services and features. These services and features include Data Integration, DataStudio, custom resource groups, exclusive resource groups, dependencies, Alarm, and DataService Studio. | Overview |
2020.06.28 | New feature | New feature | A topic is added to describe how to add a route to a VPC or a data center. | |
2020.06.28 | New best practice | User experience optimization | A best practice is added to describe how to use exclusive resource groups for Data Integration to migrate data from a self-managed MySQL database hosted on an Elastic Compute Service (ECS) instance to MaxCompute. | Migrate data from a user-created MySQL database on an ECS instance to MaxCompute |
2020.06.28 | New best practice | User experience optimization | A best practice is added to describe how to use AIRec. AIRec is developed by Alibaba Cloud based on cutting-edge big data and AI technologies, and years of experience in the e-commerce industry. AIRec provides a personalized recommendation service to increase the customer purchase rate and order conversion rate. | |
2020.06.28 | New best practice | User experience optimization | A best practice is added to describe how to grant specific users access permissions on specific resources such as tables and user-defined functions (UDFs). This best practice involves data encryption and decryption algorithms that ensure data security. | Grant a specified user the access permissions on a specific UDF |
2020.06.28 | New best practice | User experience optimization | A best practice is added to describe how to build a data warehouse for an enterprise based on AnalyticDB for MySQL and use the data warehouse for O&M and metadata management. | Build a data warehouse for an enterprise based on AnalyticDB for MySQL |
2020.06.28 | New best practice | User experience optimization | A best practice is added to describe how to use a PyODPS node in DataWorks to segment Chinese text based on Jieba, an open source segmentation tool. The topic also describes how to write the segmented words and phrases to a new table and use closure functions to segment Chinese text based on a custom dictionary. | |
2020.06.28 | New best practice | User experience optimization | A best practice is added to describe how to use a PyODPS node that runs on an exclusive resource group to send emails. | |
2020.06.28 | New best practice | User experience optimization | A best practice is added to describe how to connect DataV to DataWorks DataService Studio. You can create APIs in DataService Studio and call the APIs in DataV. Then, DataV presents analysis results of the MaxCompute data. | Best practices to connect DataV to DataWorks DataService Studio |
2020.06.28 | New best practice | User experience optimization | A best practice is added to describe how to use a PyODPS node in DataWorks to reference a third-party package. A PyODPS 2 node is used as an example. | |
2020.06.28 | New best practice | User experience optimization | A best practice is added to describe how to enable automatic synchronization of IoT data to the cloud. IoT is a network that carries data based on the Internet and traditional telecommunication networks. IoT allows physical objects that can be independently addressed to be used as data sources. | |
2020.06.15 | New data source | New data source | A topic is added to describe how to add an ApsaraDB for OceanBase data source. You can use ApsaraDB for OceanBase Reader and ApsaraDB for OceanBase Writer to read data from and write data to an ApsaraDB for OceanBase data source and can configure synchronization tasks for ApsaraDB for OceanBase data sources by using the code editor. | Add an ApsaraDB for OceanBase data source |
2020.06.15 | New data source | New data source | A topic is added to describe how to add a Vertica data source. You can use Vertica Reader and Vertica Writer to read data from and write data to a Vertica data source. You can configure synchronization tasks for Vertica data sources by using the code editor. | |
2020.06.15 | New feature | New feature | A topic is added to describe the parameters that are supported by GBase8a Reader and how to configure GBase8a Reader by using the code editor. | |
2020.06.15 | New feature | New feature | A topic is added to describe Hologres Reader. Hologres Reader allows you to export data from Hologres data warehouses. You can read data from Hologres tables and then write the data to other data sources based on the standard protocol of Data Integration. | |
2020.06.15 | New feature | New feature | A topic is added to describe Hologres Writer. Hologres Writer allows you to import data from multiple data sources to Hologres for real-time data analysis. | |
2020.06.15 | New feature | New feature | A topic is added to describe how to configure a resource group for scheduling for a node in the Resource Group section of the Properties tab of the node. | |
2020.06.15 | New description | User experience optimization | A topic is added to describe the logic of scheduling dependencies. You must make sure that the scheduling dependencies configured for a node are correct. Correct scheduling dependencies keep workflows running in order, ensure that business data is generated in an effective and timely manner, and standardize data development. | |
2020.06.15 | New resource group | New feature | A topic is added to describe how to create and use an exclusive resource group for scheduling, and associate an exclusive resource group for scheduling with a virtual private cloud (VPC) to enable the resource group to access data sources in the VPC. |
Changes in May 2020
Date | Item | Category | Description | References |
2020.05.27 | New usage notes | User experience optimization | DataWorks supports shared resource groups, exclusive resource groups, and custom resource groups. A topic is added to describe the scenarios and methods of using these resource groups. | |
2020.05.27 | New feature | New feature | A topic is added to describe how to manage report templates. You can create a template of data quality reports on the Report Template Management page. DataWorks Data Quality can periodically generate and send data quality reports based on the template. | |
2020.05.27 | New feature | New feature | A topic is added to describe how to manage rule templates. DataWorks Data Quality allows you to manage a set of custom rules and create a rule template library to configure rules in a more efficient manner. | |
2020.05.27 | New feature | New feature | A topic is added to describe the verification logic of Data Quality and the built-in rule templates that DataWorks provides for monitoring offline data. | |
Changes in April 2020
Date | Item | Category | Description | References |
2020.04.19 | Service upgrade | DataWorks V3.0 | Topics are updated to describe how to use features provided in Operation Center. In Operation Center, you can view the dashboard, manage auto triggered nodes and manually triggered nodes, and monitor nodes. | |
2020.04.18 | Service upgrade | DataWorks V3.0 | Topics are updated to describe the overall process of how to build a MaxCompute data warehouse. | Build and optimize a data warehouse |
2020.04.18 | Service upgrade | DataWorks V3.0 | A topic is added to describe the Data Integration service of DataWorks. Data Integration is a stable, efficient, and scalable data synchronization service. Data Integration is designed to migrate and synchronize data between a wide range of heterogeneous data sources in complex network environments in a fast and stable manner. | |
2020.04.08 | Service upgrade | DataWorks V3.0 | A topic is updated to describe a complete process of data development and O&M. | |
2020.04.08 | Service upgrade | DataWorks V3.0 | Topics are updated to provide an overview of DataWorks, including the basic concepts, scenarios, and data development processes. | |
Changes in March 2020
Date | Item | Category | Description | References |
2020.03.26 | New tutorial | User experience optimization | A tutorial is added to describe the complete operations in the DataWorks for EMR workshop. | |
2020.03.17 | Service upgrade | DataWorks V3.0 | A topic is updated to describe the upgraded data development mode. In the upgraded data development mode, you can group multiple workflows under a solution in a workspace. The previous hierarchical structure is no longer used. | |
2020.03.17 | Service upgrade | DataWorks V3.0 | Topics are updated to describe various types of nodes in DataWorks, such as batch synchronization nodes, ODPS nodes, EMR nodes, general nodes, and custom nodes. | |
2020.03.02 | Service upgrade | DataWorks V3.0 | Topics are updated to provide an overview of the DataWorks console. You can view the workspaces, resource groups, and compute engines in the DataWorks console. | |
Changes in February 2020
Date | Item | Category | Description | References |
2020.02.29 | New best practice | User experience optimization | A best practice is added to describe how to use the data synchronization feature of DataWorks to migrate data from Oracle to MaxCompute. | |
2020.02.02 | New feature | New feature | Topics are added to describe how to use DataAnalysis. DataAnalysis allows you to collaboratively edit and analyze workbooks, manage MaxCompute tables in tabular mode, and generate and share visual reports. | |