
What is DataWorks

Last Updated: Jun 26, 2018

DataWorks is a Platform as a Service (PaaS) product of Alibaba Cloud. It offers fully hosted workflow services and a one-stop development and management interface to help enterprises mine and explore the value of their data.

DataWorks uses MaxCompute as its core computing and storage engine to provide massive offline data processing, analysis, and mining capabilities. For more information, see MaxCompute overview.

DataWorks is a big data PaaS platform released by Alibaba Cloud. As a one-stop DW capability platform, it offers a wide range of products and services, including data integration, data development, data management, and data governance.

DataWorks makes data transmission and conversion much easier. You can import data from different storage services, convert it, and then export the results to other data systems for further processing. The following figure shows the complete data analysis process.


Function overview

Fully-hosted scheduling

DataWorks provides powerful scheduling capabilities. Tasks are triggered by time or by dependency according to their DAG relationships, and tens of millions of tasks run accurately and on time each day. Scheduling frequency can be configured by minute, hour, day, week, or month.

Because the service is fully hosted, you do not need to manage scheduling server resources. The system isolates tenants from one another so that each tenant's tasks run independently.
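
The dependency-based triggering described above can be pictured with a minimal Python sketch. It is an illustration only, not the DataWorks scheduler or its API: the task names and the `run_order` helper are hypothetical, and the standard library `graphlib` module (Python 3.9 or later) stands in for the scheduling engine. A task becomes ready only after every task it depends on has finished.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each task maps to the set of tasks it depends on.
dependencies = {
    "sync_orders": set(),
    "sync_users": set(),
    "clean_orders": {"sync_orders"},
    "join_orders_users": {"clean_orders", "sync_users"},
    "daily_report": {"join_orders_users"},
}


def run_order(deps):
    """Group tasks into batches; a task appears only after all of its
    upstream tasks have appeared in an earlier batch."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if deps is not a DAG
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks with no unfinished upstream
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished, unlocking downstream tasks
    return batches


batches = run_order(dependencies)
for number, batch in enumerate(batches, start=1):
    print(number, batch)
```

In this sketch the two synchronization tasks have no upstream tasks and can start together, while the report runs last because everything else feeds into it.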

Supports various task types

DataWorks supports multiple task types, such as data synchronization, SHELL, MaxCompute SQL, and MaxCompute MR tasks. The dependencies between tasks form complex data analysis processes.

  • Powered by MaxCompute, DataWorks provides powerful data conversion capabilities to guarantee high performance of big data analysis. For more information, see MaxCompute overview.

  • For data synchronization, DataWorks relies on its data integration capabilities, which support over 20 data sources and provide stable, highly efficient data transmission. For more information, see Data integration overview.

Visual development

DataWorks offers visual code development and workflow designer pages. You can drag and drop components to build complex data analysis tasks without installing additional development tools. All you need is a browser and an Internet connection, so you can work from anywhere.

Monitoring and alarms

The O&M center provides visual task monitoring and management tools, and shows the overall status of running tasks as a DAG.

The alarm service lets you obtain up-to-date metric data so that you can troubleshoot abnormalities in a cloud product promptly. You can create alarm rules and add alarm contacts and alarm contact groups.

Contact and contact group information is a prerequisite for alarm rules: when an exception occurs, an alarm is triggered and the notification is sent to the alarm contact and the alarm contact group. Notifications can be sent by text message, email, or TradeManager. You must therefore create a contact and a contact group before you use the alarm rule function for the first time.
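
The prerequisite described above can be sketched as a small Python model. Everything in it is hypothetical and is not the DataWorks alarm API: the `ContactGroup` and `AlarmRule` classes, their fields, and the sample contacts exist only to illustrate that a rule cannot be created until a contact group with at least one contact exists.

```python
from dataclasses import dataclass, field


@dataclass
class ContactGroup:
    """Hypothetical contact group: a name plus (contact, channel) pairs."""
    name: str
    contacts: list = field(default_factory=list)


@dataclass
class AlarmRule:
    """Hypothetical alarm rule bound to one contact group."""
    name: str
    group: ContactGroup

    def __post_init__(self):
        # Mirrors the prerequisite: no contacts means no alarm rule.
        if not self.group.contacts:
            raise ValueError("create a contact and a contact group first")

    def notify(self, message):
        """Return one notification line per contact in the group."""
        return [
            f"{channel} to {contact}: {message}"
            for contact, channel in self.group.contacts
        ]


# Channels taken from the text: text message, email, TradeManager.
on_call = ContactGroup("on_call", [("alice", "email"), ("bob", "text message")])
rule = AlarmRule("task_failed", on_call)
notifications = rule.notify("task daily_report failed")
```

Creating `AlarmRule("orphan", ContactGroup("empty"))` raises `ValueError` in this sketch, which is the situation the prerequisite is meant to prevent.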

Constraints and limitations

  • DataWorks supports only Chrome 54 or later.

  • Currently, DataWorks supports SQL operations only on MaxCompute, not on Alibaba Cloud ApsaraDB or Analytic DB.