This topic describes the service positioning, audiences, and core capabilities of DataWorks.

Service positioning

DataWorks is an end-to-end cloud R&D platform that is standardized, visualized, transparent, and intelligent. It covers the full lifecycle of big data and provides both individual development capabilities and full-stack data R&D capabilities for data developers, data analysts, and data asset managers. With DataWorks, you can use a single platform to handle complex scenarios that require data transmission, data computing, data governance, and data sharing.

DataWorks is designed to meet the requirements of enterprises for building data warehouses and data mid-ends, and it supports the digital transformation of enterprise business.

Audiences

  • Technical personnel such as data developers and algorithm developers
  • Business personnel such as sales and operations personnel and business intelligence analysts
  • Administrators who are engaged in data security and data compliance
  • Data application developers
  • Managers who manage the core data assets of enterprises

Core capabilities

DataWorks provides the following core capabilities:
  • Data integration: supports data transmission between various data sources that reside in complex network environments, including data migration to the cloud.
  • Data development: provides a data development mode in which the development environment and production environment are isolated. This feature allows you to develop nodes that use different compute engines and to configure complex scheduling dependencies for the nodes. The nodes include batch processing nodes, stream processing nodes, and Machine Learning Platform for AI (PAI) nodes.
  • Data analysis (available in DataWorks only on the Alibaba Cloud public cloud): allows you to perform quick and flexible ad hoc queries based on workbooks.
  • Data service: allows you to quickly generate serverless APIs without writing code.
  • Data quality: allows you to configure table-level or field-level monitoring rules to monitor data quality and helps you identify dirty data at the earliest opportunity.
  • Monitoring and alerting: allows you to easily configure monitoring and alerting settings for complex workflows.
  • Data map (available in DataWorks only on the Alibaba Cloud public cloud) or Data management (available only in Apsara Stack DataWorks): provides powerful capabilities such as data search, data categorization, and data lineage.
  • Data asset management (available only in Apsara Stack DataWorks): allows you to manage data assets such as data tables and APIs in DataWorks in a centralized manner.
  • Data security: provides features such as data masking and permission management.
  • Application development (available in DataWorks only on the Alibaba Cloud public cloud): allows you to easily build data applications by dragging components on the DataStudio page.
  • Workspace management (available in DataWorks only on the Alibaba Cloud public cloud) or Platform management (available only in Apsara Stack DataWorks): allows administrators to manage, at the system level, the permissions of DataWorks users or members and the configurations of the underlying compute engines.
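
The scheduling dependencies mentioned under Data development can be pictured as a directed acyclic graph of nodes. The following minimal Python sketch is not DataWorks code and does not call any DataWorks API. The node names and the helper functions `run_order` and `has_cycle` are hypothetical and only illustrate how upstream dependencies constrain the order in which nodes can run.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical node names (assumption, not taken from DataWorks).
# Each key maps to the set of upstream nodes that must finish first.
dependencies = {
    "sync_orders": set(),                               # batch sync node
    "sync_users": set(),                                # batch sync node
    "clean_orders": {"sync_orders"},                    # batch processing node
    "join_orders_users": {"clean_orders", "sync_users"},
    "train_model": {"join_orders_users"},               # machine learning node
}


def run_order(deps):
    """Return one valid run order in which every node follows its upstream nodes."""
    return list(TopologicalSorter(deps).static_order())


def has_cycle(deps):
    """Return True if the dependency graph contains a cycle and cannot be scheduled."""
    try:
        TopologicalSorter(deps).prepare()
    except CycleError:
        return True
    return False


order = run_order(dependencies)
print(order)
print("cyclic:", has_cycle(dependencies))
```

In this sketch, a node becomes runnable only after all of its upstream nodes are done, and a circular dependency is rejected before anything runs. A real scheduler would also take into account aspects such as scheduling cycles and the isolation between environments, which this example leaves out.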

You can use DataWorks to process and analyze massive amounts of data in offline mode. You can also use DataWorks to implement best practices that cover the full lifecycle of big data. These best practices include data aggregation and integration, development, scheduling and O&M, online and offline analysis of data, data quality governance and asset management, security audit, data sharing and services, machine learning, and application building. DataWorks provides an end-to-end solution that spans data collection to data display and data analysis to data-driven applications, and it helps you apply data in your business and present business status by using data.