This topic introduces Alibaba Cloud DataWorks, including its features and limits.

DataWorks is a platform as a service (PaaS) offering of Alibaba Cloud. It provides a full range of services, including Data Integration, DataStudio, Data Map, Data Quality, and DataService Studio, together with a one-stop data development and management console that helps enterprises mine and explore the value of their data.

DataWorks supports multiple computing and storage engines, including MaxCompute, E-MapReduce, Realtime Compute, Machine Learning Platform for AI, Graph Compute, and Hologres. It also allows you to use custom computing and storage services. As a one-stop platform, DataWorks provides end-to-end big data services, artificial intelligence (AI) development, and data governance.

DataWorks simplifies data transmission, conversion, and integration. You can import data from different data stores, convert, analyze, and process the data, and then transmit the data to other data systems.

Features

  • DataWorks is a cloud-hosted environment.
    • DataWorks provides powerful scheduling capabilities. For more information, see Schedule.
      • In DataWorks, nodes can be triggered by time-based or dependency-based scheduling configurations. For more information, see Scheduling properties and Dependencies.
      • DataWorks enables tens of millions of nodes to run accurately and on time every day based on node relationships in directed acyclic graphs (DAGs).
      • DataWorks supports running nodes at custom intervals in minutes, hours, days, weeks, or months.
    • Because DataWorks is cloud-hosted, you do not need to deploy servers.
    • DataWorks isolates tenants so that nodes of different tenants do not affect each other.
  • DataWorks supports multiple node types, such as Batch Sync, Shell, ODPS SQL, and ODPS MR nodes. It analyzes and processes complex data based on the dependencies between nodes.
    • Data conversion: By using the computing capabilities of MaxCompute, DataWorks delivers high performance in analyzing and processing big data.
    • Data synchronization: Backed by Data Integration, DataWorks supports more than 20 types of connections and provides stable, efficient data transmission.
  • DataWorks provides visualized code development.

    DataWorks provides visualized code development and workflow designer pages. You can develop complex data analytics nodes through simple drag-and-drop operations without installing any development tools. All you need is a browser with Internet access, so you can develop code anytime, anywhere. For more information, see GUI elements.

  • DataWorks supports monitoring and alerting.

    Operation Center provides a visualized node monitoring and management tool and displays the overall node running status in DAGs.

    You can configure various alert notification methods so that the relevant staff are notified promptly when a node error occurs. This helps keep business operations running normally.
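The dependency-based scheduling described under Features can be illustrated with a short sketch. The following Python example is not DataWorks code and does not use any DataWorks API. It only shows, under simplifying assumptions, how nodes in a directed acyclic graph (DAG) can be ordered so that every node runs after the nodes it depends on, and how a failed node blocks its descendants. The node names and the functions `run_order` and `run_workflow` are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: each node maps to the set of nodes it depends on.
dependencies = {
    "sync_orders": set(),
    "sync_users": set(),
    "clean_orders": {"sync_orders"},
    "join_orders_users": {"clean_orders", "sync_users"},
    "daily_report": {"join_orders_users"},
}


def run_order(deps):
    """Return one valid order in which every node follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


def run_workflow(deps, run_node):
    """Run nodes in dependency order.

    A node is skipped if any of its upstream nodes failed or was skipped.
    """
    status = {}
    for node in run_order(deps):
        upstream = deps.get(node, ())
        if any(status[up] != "success" for up in upstream):
            status[node] = "skipped"
            continue
        try:
            run_node(node)
            status[node] = "success"
        except Exception:
            status[node] = "failed"
    return status
```

In this sketch, a failure in `clean_orders` causes `join_orders_users` and `daily_report` to be skipped, while the independent node `sync_users` still runs. A real scheduler additionally handles time-based triggers, retries, and alerting, which are omitted here.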

Limits

  • DataWorks supports only Google Chrome 54 or later.
  • Currently, DataWorks supports only MaxCompute SQL operations.

Upgrade

DataWorks upgrades are applied automatically and do not affect current users.