This topic introduces Alibaba Cloud DataWorks, including its features and limits.

DataWorks is a platform as a service (PaaS) offering of Alibaba Cloud. It provides a full range of services, including Data Integration, DataStudio, Data Map, Data Quality, and DataService Studio. It also provides an all-in-one data development and management console to help enterprises mine and explore the value of their data.

DataWorks supports multiple compute and storage engines, including MaxCompute, E-MapReduce, Realtime Compute, Machine Learning Platform for AI, Graph Compute, and Hologres. It also allows you to use custom computing and storage services. As an all-in-one platform, DataWorks provides end-to-end big data services, artificial intelligence (AI) development, and data governance.

DataWorks simplifies data transmission, conversion, and integration. You can import data from different data stores, convert, analyze, and process the data, and then transmit the data to other data systems.

Limits

DataWorks supports only Google Chrome 54 or later.

Features

  • DataWorks is a cloud-hosted environment.
    • DataWorks provides powerful scheduling capabilities. For more information, see Schedule.
      • In DataWorks, nodes can be triggered by time-based or dependency-based scheduling configurations. For more information, see Scheduling properties and Dependencies.
      • DataWorks enables tens of millions of nodes to run accurately and on time every day based on node relationships in directed acyclic graphs (DAGs).
      • DataWorks can run nodes at custom intervals of minutes, hours, days, weeks, or months.
    • Because DataWorks is cloud-hosted, you do not need to deploy servers.
    • DataWorks isolates tenants so that nodes of different tenants do not affect each other.
  • DataWorks supports multiple node types, such as batch sync, Shell, ODPS SQL, and ODPS MR nodes. It analyzes and processes complex data based on the dependencies between nodes.
    • Data conversion: By using the computing capabilities of MaxCompute, DataWorks delivers high performance when analyzing and processing big data.
    • Data synchronization: Based on the Data Integration service, DataWorks supports more than 20 types of data stores and provides stable and efficient data transmission features. For more information, see Data Integration and Supported data stores and plug-ins.
  • DataWorks provides visualized code development.

    DataWorks provides a graphical user interface (GUI) for you to develop code and design workflows. You can perform simple drag-and-drop operations to create complex data analytics nodes without using any development tools. For more information, see GUI elements.

    With a browser and Internet access, you can develop code anytime, anywhere.

  • DataWorks supports monitoring and alerting.

    Operation Center provides a visualized node monitoring and management tool and displays the overall node running status in DAGs. For more information, see Operation Center.

    You can configure various alert notification methods to promptly notify relevant staff when a node error occurs. This ensures normal business operation. For more information, see Monitor.
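
The dependency-based scheduling described in the Features list can be illustrated with a small, generic sketch. This is not the DataWorks API or its scheduler. It only shows, under stated assumptions, how a directed acyclic graph (DAG) of nodes yields an execution order in which every node runs after its upstream nodes. The node names and the `run_order` helper are hypothetical, and the example uses the `graphlib` module from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical node names, chosen for illustration only.
# Each key maps to the set of upstream nodes it depends on.
dependencies = {
    "sync_orders": set(),
    "sync_users": set(),
    "clean_orders": {"sync_orders"},
    "join_orders_users": {"clean_orders", "sync_users"},
    "daily_report": {"join_orders_users"},
}


def run_order(deps):
    """Return one valid execution order for the DAG.

    Every node appears after all of its upstream nodes.
    graphlib raises CycleError if the graph contains a cycle.
    """
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    for position, node in enumerate(run_order(dependencies), start=1):
        print(position, node)
```

A DAG can have more than one valid order. For example, the two sync nodes above do not depend on each other, so either can run first. A real scheduler would also take the configured run time of each node into account, which this sketch omits.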