DataWorks provides the Data Quality module to help you control data quality across heterogeneous data stores. In Data Quality, you can check data quality, configure alert notifications, and manage connections.

Built on DataWorks, Data Quality provides a comprehensive data quality solution. Its features include data detection, data comparison, data quality monitoring, SQL node scanning, and intelligent alerting.

Data Quality can monitor data at every stage of processing, detect issues based on monitoring rules, and promptly send alert notifications to alert recipients.

Data Quality monitors data quality by dataset. Currently, it allows you to monitor data in E-MapReduce tables, AnalyticDB for PostgreSQL tables, MaxCompute tables, and DataHub topics. When offline data in E-MapReduce, AnalyticDB for PostgreSQL, or MaxCompute changes, Data Quality checks the data and, if it detects anomalies, blocks the nodes that use the data. This prevents the anomalous data from affecting those nodes. Data Quality also allows you to manage the check result history so that you can analyze and evaluate the data quality.
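The check-then-block behavior described above can be pictured with a minimal sketch. This is not the Data Quality implementation or its API. The names `CheckResult`, `check_row_count`, `nodes_to_run`, and the `max_change_ratio` threshold are all hypothetical, and a row-count fluctuation check is assumed only as one example of a monitoring rule.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CheckResult:
    table: str
    change_ratio: float  # relative change in row count versus the previous run
    passed: bool


def check_row_count(table: str, current_rows: int, previous_rows: int,
                    max_change_ratio: float = 0.5) -> CheckResult:
    """Hypothetical rule: flag an anomaly when the row count changes too much."""
    if previous_rows == 0:
        # Assumption: with no baseline, a non-empty table passes.
        return CheckResult(table, 0.0, current_rows > 0)
    change = abs(current_rows - previous_rows) / previous_rows
    return CheckResult(table, change, change <= max_change_ratio)


def nodes_to_run(result: CheckResult, downstream: List[str]) -> List[str]:
    """Return the nodes allowed to run; block all of them if the check failed."""
    return list(downstream) if result.passed else []


# A 90% drop in rows fails the check, so the dependent nodes are blocked.
result = check_row_count("orders", current_rows=100, previous_rows=1000)
blocked_run = nodes_to_run(result, ["report_node", "export_node"])
```

In this sketch, blocking simply means that no node that reads the table is scheduled until the check passes.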

For streaming data, Data Quality uses DataHub to monitor data streams and sends alert notifications to subscribers if it detects stream discontinuity. You can set the alert severity, for example warning or error, and the alert frequency to minimize repeated alerts.
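As a rough illustration of how discontinuity detection and alert frequency can interact, consider the sketch below. It does not use the DataHub or DataWorks APIs. The `StreamMonitor` class, its fields, and the rule that a gap longer than `max_gap_seconds` counts as a discontinuity are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class StreamMonitor:
    max_gap_seconds: float     # silence longer than this counts as a discontinuity
    min_alert_interval: float  # minimum seconds between two alerts (alert frequency)
    severity: str = "warning"  # hypothetical levels: "warning" or "error"
    last_record_at: Optional[float] = None
    last_alert_at: Optional[float] = None
    alerts: List[Tuple[float, str]] = field(default_factory=list)

    def on_record(self, now: float) -> None:
        """Record the arrival time of the latest stream record."""
        self.last_record_at = now

    def tick(self, now: float) -> bool:
        """Check the stream at time `now`; return True if an alert is sent."""
        if self.last_record_at is None:
            return False  # nothing received yet, so no baseline to compare against
        if now - self.last_record_at <= self.max_gap_seconds:
            return False  # stream is still flowing
        if (self.last_alert_at is not None
                and now - self.last_alert_at < self.min_alert_interval):
            return False  # suppress a repeated alert
        self.last_alert_at = now
        self.alerts.append((now, self.severity))
        return True
```

With `max_gap_seconds=60` and `min_alert_interval=300`, a stream that goes silent at time 0 triggers an alert at the first check after 60 seconds, and later checks stay quiet until at least 300 seconds have passed since that alert.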

The following figure shows the data monitoring flowchart in Data Quality.
[Figure: Data monitoring flowchart]
Note: Data Quality monitors the quality of data in E-MapReduce tables, AnalyticDB for PostgreSQL tables, MaxCompute tables, and DataHub topics. To use Data Quality features, you must first create tables or topics and write data to them.

You can create tables and topics and write data to them in the DataWorks console. You can also create MaxCompute tables and write data to the tables on the MaxCompute client.