What is DataWorks

Last Updated: Dec 05, 2017

DataWorks is an important PaaS product of Alibaba Cloud. It provides fully hosted workflow services and a one-stop development and management interface to help enterprises mine and explore the full value of their data.

DataWorks uses MaxCompute as its core computing and storage engine to provide massive offline data processing, analysis, and mining capabilities. For more information, see MaxCompute overview.

DataWorks is a big data PaaS platform released by Alibaba Cloud. As a one-stop DW capability platform, it provides a comprehensive range of products and services, including data integration, data development, data management, and data governance.

With DataWorks, you can transmit, convert, and otherwise work with data: import data from different storage services, convert it, and ultimately export it to other data systems. A complete data analysis process is shown in the following figure:

Architecture

Function overview

Fully-hosted scheduling

DataWorks provides powerful scheduling capabilities, including time-based and dependency-based task triggers, to run tens of millions of tasks accurately and punctually each day based on DAG relationships. Scheduling frequency can be configured by minute, hour, day, week, or month.

Because the service is fully hosted, you do not need to worry about scheduling server resources. The system isolates different tenants, ensuring that their tasks do not interfere with each other.
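The dependency-based triggering described above can be illustrated with a minimal sketch. This is not DataWorks code or its API: it only shows, using Python's standard `graphlib` module (Python 3.9 or later), how a DAG of task dependencies determines which tasks may run and in what order. The task names and the `run_order` helper are hypothetical and chosen for illustration only.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: each key maps to the set of upstream tasks it depends on.
dependencies = {
    "sync_orders": set(),
    "sync_users": set(),
    "clean_orders": {"sync_orders"},
    "join_orders_users": {"clean_orders", "sync_users"},
    "daily_report": {"join_orders_users"},
}


def run_order(deps):
    """Return batches of tasks; every task in a batch has all upstream tasks finished."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the dependencies contain a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose upstream tasks are all done
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished so downstream tasks unlock
    return batches


for step, batch in enumerate(run_order(dependencies)):
    print(step, batch)
```

Tasks in the same batch have no dependencies on each other, so a scheduler could run them in parallel. A time-based trigger would simply decide when the first batch is allowed to start.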

Support for various task types

DataWorks supports multiple task types, including data synchronization, SHELL, MaxCompute SQL, and MaxCompute MR tasks. The dependencies between tasks form complex data analysis processes.

  • Powered by MaxCompute, DataWorks provides powerful data conversion capabilities to ensure the high performance of big data analysis. For more information, see MaxCompute overview.

  • For data synchronization, DataWorks relies on its data integration capabilities to support over 20 data sources and provide stable and highly efficient data transmission. For more details, see Data integration overview.

Visual development

DataWorks provides visual code development and workflow designer pages. Without additional development tools, you can drag and drop components to develop complex data analysis tasks. All you need is a browser with an Internet connection to carry out development tasks wherever you are.

Monitoring and alarms

The O&M center provides visual task monitoring and management tools, and displays global conditions in DAG format when tasks are running.

You can easily configure text message alarms, so that the relevant staff will be notified as soon as a task error occurs. This ensures the smooth operation of your business.

Constraints and limitations

  • DataWorks supports only Chrome 54 or later.
  • Currently, DataWorks supports SQL operations only on MaxCompute, not on Alibaba Cloud ApsaraDB or AnalyticDB.