DataWorks V3.0

Last Updated: Mar 12, 2025

This topic provides the release notes for DataWorks V3.0 and describes its new features.

Release information

Version: DataWorks V3.0

  • Date: December 18, 2019

  • Region: all regions that support DataWorks

  • Content: In addition to the MaxCompute compute engines supported by DataWorks V2.0, DataWorks V3.0 integrates other types of compute engines, such as E-MapReduce (EMR) and Hologres, to build a multi-engine architecture.

    A workspace supports multiple compute engine instances, which allow you to manage your workflows, tasks, and tables in a workspace in a centralized manner.

Key features

  • Multiple types of compute engines

    DataWorks V3.0 upgrades the plug-ins for multiple types of compute engines. In addition to the MaxCompute compute engines supported by DataWorks V2.0, DataWorks V3.0 integrates other types of compute engines, such as EMR and Hologres.

    • MaxCompute: MaxCompute is an efficient and fully managed computing platform for large-scale data warehousing. It supports the processing of exabytes of data. MaxCompute is the first and most mature compute engine supported by DataWorks. Almost all features of MaxCompute have been seamlessly integrated into DataWorks. For more information, see What is MaxCompute?

    • EMR: EMR is a big data engine that runs on Alibaba Cloud Elastic Compute Service (ECS) based on open source Apache Hadoop and Apache Spark. You can analyze and process your data by using peripheral systems such as Apache Hive in the Hadoop and Spark ecosystems.

      All services and features of DataWorks V3.0, such as metadata management, Data Map, data lineage, DataStudio, task scheduling, task O&M and monitoring, and Data Quality, support EMR. For information about EMR, see What is EMR on ECS?

    • Hologres: Hologres is a real-time interactive data analysis service that is fully compatible with PostgreSQL and seamlessly integrated with the Alibaba Cloud big data ecosystem.

      You can use Hologres to gain analytical insight into trillions of data records across multiple dimensions with high concurrency and low latency, and to explore new business opportunities. You can also use your business intelligence (BI) tools together with Hologres.

      DataWorks V3.0 provides HoloStudio, an end-to-end online analytical processing (OLAP) service that offers standardized, easy-to-use development and management services as well as real-time data warehousing services. This makes development simpler and more efficient.

  • Multiple compute engine instances in a workspace

    In DataWorks V2.0, you can configure only one compute engine instance for a workspace. For example, if you use a DataWorks V2.0 workspace, you can create only one MaxCompute project for the workspace. In DataWorks V3.0, you can configure multiple compute engine instances of each compute engine type for a workspace that is created in DataWorks Professional Edition or a more advanced edition. This way, you can manage the compute engines, computing tasks, and data tables that your business requires in a flexible and unified manner.

  • Resource group orchestration

    DataWorks V3.0 will support resource group orchestration. This feature allows you to quickly configure and change resource groups for multiple tasks at a time. For example, you can change from exclusive resource groups to a serverless resource group for multiple tasks at a time.

  • Data import and export in a workspace

    DataWorks V2.0 supports data backup and recovery in workspaces. DataWorks V3.0 upgrades this feature and makes it more flexible. You can import or export nodes, tasks, table DDL statements, resources, functions, and data sources to or from workspaces. This facilitates data migration from workspaces and workspace initialization.
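
The multi-instance workspace and resource group orchestration features described above can be pictured with a small data model. The following is a minimal sketch only, not the DataWorks API: the class names (EngineInstance, Task, Workspace), the method names, and all instance, task, and resource group identifiers are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class EngineInstance:
    engine_type: str  # e.g. "MaxCompute", "EMR", "Hologres"
    name: str         # hypothetical instance name


@dataclass
class Task:
    name: str
    engine: EngineInstance
    resource_group: str  # hypothetical resource group identifier


@dataclass
class Workspace:
    name: str
    engines: List[EngineInstance] = field(default_factory=list)
    tasks: List[Task] = field(default_factory=list)

    def engines_by_type(self) -> Dict[str, List[str]]:
        """Group engine instance names by engine type."""
        grouped: Dict[str, List[str]] = {}
        for engine in self.engines:
            grouped.setdefault(engine.engine_type, []).append(engine.name)
        return grouped

    def change_resource_group(self, old_group: str, new_group: str) -> int:
        """Move every task on old_group to new_group; return how many changed."""
        changed = 0
        for task in self.tasks:
            if task.resource_group == old_group:
                task.resource_group = new_group
                changed += 1
        return changed


# One workspace holding two MaxCompute instances and one Hologres instance.
mc_sales = EngineInstance("MaxCompute", "mc_sales")
mc_ads = EngineInstance("MaxCompute", "mc_ads")
holo_rt = EngineInstance("Hologres", "holo_realtime")

ws = Workspace(
    name="demo_workspace",
    engines=[mc_sales, mc_ads, holo_rt],
    tasks=[
        Task("daily_sales_rollup", mc_sales, "exclusive_rg_1"),
        Task("ad_click_etl", mc_ads, "exclusive_rg_1"),
        Task("realtime_dashboard", holo_rt, "serverless_rg"),
    ],
)

print(ws.engines_by_type())

# Switch all tasks from the exclusive group to the serverless group in one call.
moved = ws.change_resource_group("exclusive_rg_1", "serverless_rg")
print(moved)
```

In this sketch, a single workspace object owns several engine instances of the same type, and one call reassigns the resource group for every matching task. This mirrors the idea of changing resource groups for multiple tasks at a time, without claiming anything about how DataWorks implements it.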