This module introduces the design ideas and core capabilities of Alibaba Cloud DataWorks.

Course Overview

Course duration: Two hours, delivered online.

Target audience: All new and existing DataWorks users, such as Java engineers, product operations staff, and HR staff. If you are familiar with standard SQL, you can quickly master the basic skills of DataWorks without needing to know much about the principles of data warehouses or MaxCompute. However, we recommend that you also study the further DataWorks courses to gain insight into the basic concepts and functions of DataWorks.

Course objective: The course uses a common real-world task, the analysis of massive log data, as its background. After completing the course, you will understand the main features of DataWorks and be able to follow the course demonstrations to independently complete common data tasks such as data acquisition, data development, and task operations.

This course includes the following:
  • Product introduction: You will learn about DataWorks' development history, its overall architecture, and its modules and their relationships.
  • Data acquisition: Learn how to synchronize data from different data sources to MaxCompute, how to quickly trigger task runs, how to view task logs, and so on.
  • Data processing: Learn how to run a data flow chart, how to create a new data table, how to create a data processing task node, and how to configure periodic scheduling properties for tasks.
  • Data quality: Learn how to configure data quality monitoring rules for tasks to safeguard the quality of task runs.

DataWorks introduction

DataWorks is a big data research and development platform that uses MaxCompute as its main compute engine. It provides data integration, data modeling, data development, operations monitoring, data management, data security, data quality, and other product functions. It is also integrated with the algorithm platform PAI, completing the link from big data development to data mining and machine learning.

Data Acquisition

For more information on data acquisition, see Data acquisition: Log data upload.

Data Processing

For details on data processing, see Data processing: user portraits.

Data quality

For more information on data quality, see Data quality monitoring.

Learning Q&A

If you encounter problems during the course, you can join the DingTalk group 11718465 to consult Alibaba Cloud technical support.