To help guarantee data quality, verify data while it is being generated in your business system.

Verify SQL statements before committing them

Before you commit SQL statements that generate data, verify them either manually or by configuring the system to verify them.

If you enter an SQL statement with a syntax error in the SQL editor in DataWorks DataStudio, the syntax error is underlined with a red wavy line.

Test nodes before deploying them

To guarantee the accuracy of online data, you must test each updated node before you deploy it to the production environment. You can deploy the node only after it passes the test. You can test an updated node by conducting code review or regression testing. For an application with a high data class, regression testing is mandatory for each updated node. In this tutorial, the data class of the application is A2, which is high.
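
The deployment rule above can be sketched as a small pre-deployment gate. This is an illustrative sketch only, not a DataWorks API: the `NodeChange` type, the `can_deploy` function, and the `HIGH_DATA_CLASSES` set are hypothetical names, and the set contains only A2 because that is the only class this tutorial identifies as high.

```python
from dataclasses import dataclass

# Assumption: this tutorial only states that A2 is a high data class.
# Extend this set to match the classification used in your organization.
HIGH_DATA_CLASSES = {"A2"}


@dataclass
class NodeChange:
    """Hypothetical record of the checks performed on an updated node."""
    node_name: str
    data_class: str
    code_reviewed: bool = False
    regression_tested: bool = False


def can_deploy(change: NodeChange) -> bool:
    """Return True if the updated node has passed the required checks."""
    if change.data_class in HIGH_DATA_CLASSES:
        # A high data class requires regression testing; review alone is not enough.
        return change.regression_tested
    # Otherwise, either code review or regression testing is accepted.
    return change.code_reviewed or change.regression_tested
```

For example, `can_deploy(NodeChange("rpt_user_trace_log", "A2", code_reviewed=True))` returns `False` because the node has not yet passed regression testing.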

You must conduct regression testing in an environment that matches the production environment as closely as possible.
  • In a workspace in standard mode, you can execute SQL statements to copy data from the production environment to the development environment, run the related workflow, and then check for errors.
  • In a workspace in basic mode, you can run the related workflow directly and check for errors.
This tutorial uses a workspace in basic mode, so you can run the related workflow directly.
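
For a workspace in standard mode, the copy step described above could look like the following sketch. It only builds the SQL text; it does not connect to any service. The function name `build_copy_sql`, the project names, the column names, and the partition column `dt` are all assumptions made for illustration, so adjust them to your own tables.

```python
import re
from typing import List

# Only plain identifiers are accepted, so that the generated SQL stays safe.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
_PARTITION = re.compile(r"[0-9]{8}")


def build_copy_sql(prod_project: str, dev_project: str, table: str,
                   columns: List[str], partition_value: str) -> str:
    """Build a statement that copies one partition from production to development.

    Assumption: the table is partitioned by a column named `dt` whose values
    look like 20240101. Both are hypothetical and may differ in your project.
    """
    for name in [prod_project, dev_project, table, *columns]:
        if not _IDENT.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    if not _PARTITION.fullmatch(partition_value):
        raise ValueError(f"unexpected partition value: {partition_value!r}")
    column_list = ", ".join(columns)
    return (
        f"INSERT OVERWRITE TABLE {dev_project}.{table} "
        f"PARTITION (dt='{partition_value}')\n"
        f"SELECT {column_list}\n"
        f"FROM {prod_project}.{table}\n"
        f"WHERE dt='{partition_value}';"
    )
```

Calling `build_copy_sql("demo", "demo_dev", "ods_user_trace_log", ["uid", "url"], "20240101")` produces a statement that you can review and then run in the development environment before you start the workflow.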

If every node in the workflow is marked with a green check mark (√), the workflow ran successfully.
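
If you collect node results outside the console, the same success criterion can be expressed in a few lines. This is a minimal sketch: the status string `"SUCCESS"` and the function names are hypothetical, not values returned by DataWorks.

```python
from typing import Dict, List

# Assumption: each node reports a status string, and "SUCCESS" marks a passed node.
SUCCESS = "SUCCESS"


def workflow_succeeded(node_statuses: Dict[str, str]) -> bool:
    """Return True only if there is at least one node and every node succeeded."""
    return bool(node_statuses) and all(
        status == SUCCESS for status in node_statuses.values()
    )


def failed_nodes(node_statuses: Dict[str, str]) -> List[str]:
    """Return the names of nodes that did not succeed, sorted for stable output."""
    return sorted(
        name for name, status in node_statuses.items() if status != SUCCESS
    )
```

An empty result is treated as a failure on purpose, so that a workflow that never ran is not mistaken for one that passed.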

Notify relevant persons of data changes

Before you update data, you must notify the owners of the data involved in descendant nodes. Tell them the reason for the update, its logic, and when it will take place. You can implement the update only after the owners confirm it. In this tutorial, if the table schema in Tablestore changes, you must ask the owners of the ots_user_trace_log, ods_user_trace_log, dw_user_trace_log, and rpt_user_trace_log tables to update their table schemas promptly.
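
The notification process above can be tracked with a simple checklist. The following sketch is illustrative only: the `ChangeNotice` class and its methods are hypothetical, and how the message is delivered (email, chat, ticket) is left out. The table names are the ones used in this tutorial.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Tables named in this tutorial whose owners must confirm a schema change.
AFFECTED_TABLES = [
    "ots_user_trace_log",
    "ods_user_trace_log",
    "dw_user_trace_log",
    "rpt_user_trace_log",
]


@dataclass
class ChangeNotice:
    """Hypothetical record of a planned update and the owners' confirmations."""
    reason: str
    logic: str
    scheduled_time: str
    confirmations: Dict[str, bool] = field(default_factory=dict)

    def message_for(self, table: str) -> str:
        """Build the text that tells an owner the reason, logic, and time."""
        return (
            f"Planned update affecting {table}\n"
            f"Reason: {self.reason}\n"
            f"Logic: {self.logic}\n"
            f"Time: {self.scheduled_time}"
        )

    def confirm(self, table: str) -> None:
        """Record that the owner of the given table confirmed the update."""
        self.confirmations[table] = True

    def may_proceed(self, tables: List[str]) -> bool:
        """Return True only after the owner of every listed table has confirmed."""
        return all(self.confirmations.get(table, False) for table in tables)
```

With this sketch, `may_proceed(AFFECTED_TABLES)` stays `False` until `confirm` has been called for all four tables.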