Impacts of removing or changing the output of a node - DataWorks

This topic describes the operations that lead to the removal or change of the output of a node. This topic also describes the impacts of the removal or change and the solutions to related issues.

Precautions

If a node has descendant nodes, and you remove the output of the node, the following issues may occur:

If the descendant nodes depend only on the current node, the descendant nodes may become isolated nodes and cannot be scheduled as expected. For information about isolated nodes, see Isolated nodes.
If you remove the output of the node before the output table of the node is generated, data quality issues may occur.

We recommend that you evaluate the impacts of changing the output of the node on your business and proceed with caution.

Operations that lead to the removal or change of the output of a node

When you create a node, the system automatically generates two outputs for the node. You can add an output to the node or enable the automatic parsing feature to allow the system to add the table that is generated by the node as the output of the node.

If you perform the following operations, the output of a node may be removed:

Manually remove the output of a node.
Disable the automatic parsing feature without checking the outputs of the node. In this case, if the code of the node changes, the node may no longer generate the table that corresponds to the removed output.

Note: If a node has descendant nodes, changing the output of the node can severely affect the descendant nodes.

Impacts of removing or changing the output of a node on descendant nodes and the solutions to related issues

Note: Removing the output of a node does not affect whether the node generates table data. The output of a node is used only to form scheduling dependencies between the node and other nodes. Whether a node generates table data is determined by the logic of the node code, regardless of whether the output of the node is removed.

If you remove or change the output of a node, the descendant nodes of the node may not be scheduled as expected, or data quality issues may occur on the descendant nodes. The impacts are most severe if the descendant nodes have strong dependencies on the node. If you remove the output of a node that has descendant nodes, a message appears in the DataWorks console. Check the impacts on your business and proceed with caution.

If the descendant nodes depend only on the current node, the descendant nodes may become isolated nodes. Isolated nodes cannot be scheduled. For information about isolated nodes, see Isolated nodes.
If the descendant nodes depend on multiple nodes, data quality issues may occur.
If the node has descendant nodes, and you remove or change the output of the node, the descendant nodes may fail to run or data quality issues may occur. Note that the output of a node automatically changes when the table that is generated by the node is changed.
If the node that generates table data changes, you must configure new dependencies for the descendant nodes of the node.
You can use the automatic parsing feature to obtain the scheduling dependencies of a node based on the lineage in the code of the node, and configure the scheduling dependencies for the node. If a node generates Table A and a descendant node B depends on Table A, the system adds Table A to Output for the current node. The ID and name of descendant node B are also displayed. If the current node no longer generates Table A due to business changes, you must identify the node from which Table A is generated and configure dependencies between descendant node B and the node that generates Table A.
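The impact check described above can be sketched in a few lines of Python. This is a minimal illustration, not the DataWorks API: the function `removal_impact`, the dictionaries `outputs` and `inputs`, and all node and table names are hypothetical, and the sketch assumes that each output name is declared by at most one node.

```python
# Hypothetical in-memory model; none of these names come from the DataWorks API.
from typing import Dict, List, Set


def removal_impact(
    outputs: Dict[str, Set[str]],
    inputs: Dict[str, Set[str]],
    node: str,
    removed_output: str,
) -> Dict[str, List[str]]:
    """Classify the descendants of `node` that lose `removed_output`.

    outputs maps each node to the output names it declares.
    inputs maps each node to the output names it depends on.
    Assumption: an output name is declared by at most one node.
    """
    # Outputs that remain declared once `node` drops `removed_output`.
    remaining = {
        name: (outs - {removed_output} if name == node else set(outs))
        for name, outs in outputs.items()
    }
    published = set().union(*remaining.values())

    isolated: List[str] = []
    partially_affected: List[str] = []
    for child, needed in sorted(inputs.items()):
        if removed_output not in needed:
            continue  # this node never depended on the removed output
        if (needed - {removed_output}) & published:
            # Other ancestors still resolve, but this dependency is lost,
            # so the node may run before its input data is ready.
            partially_affected.append(child)
        else:
            # No ancestor resolves any longer: the node becomes isolated.
            isolated.append(child)
    return {"isolated": isolated, "partially_affected": partially_affected}


outputs = {"node_a": {"proj.table_a"}, "node_x": {"proj.table_x"}}
inputs = {
    "node_b": {"proj.table_a"},
    "node_d": {"proj.table_a", "proj.table_x"},
}
impact = removal_impact(outputs, inputs, "node_a", "proj.table_a")
print(impact)
```

In this sketch, `node_b` depends only on the removed output and is reported as isolated, whereas `node_d` still has another ancestor and is reported as partially affected, which corresponds to the data quality risk for nodes that depend on multiple nodes.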

Example: What do I do if the table that is generated by a node changes and the dependencies for descendant nodes become invalid?

In this example, the Node_A node generates Table A, and the descendant node Node_B needs to process the data of Table A. The system configures dependencies between Node_A and Node_B by using the automatic parsing feature. If business changes cause Table A to be generated by another node, Node_C, dependencies must be configured between Node_C and Node_B instead.

The system configures dependencies between Node_A and Node_B by using the automatic parsing feature. This way, ancestor and descendant nodes are scheduled as expected.
Node_A no longer generates Table A due to business changes. In this case, the dependencies that are configured by using the automatic parsing feature between Node_A and Node_B become invalid.
Node_B depends only on Node_A. As a result, the dependencies for Node_B become invalid. In this case, Node_B becomes an isolated node and cannot be scheduled as expected.
You can add Table A as the output of Node_C. This way, the system configures dependencies between Node_C and Node_B by using the automatic parsing feature.
After you add Table A as the output of Node_C, commit and deploy Node_C to ensure that dependencies between Node_C and Node_B can be configured.
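The sequence in this example can be traced with a small Python model. It is only a sketch of how matching a table name against declared outputs determines the ancestor node. The function `resolve_ancestor`, the `outputs` dictionary, and the `table_a` identifier are hypothetical and are not part of DataWorks. The commit and deploy step is not modeled.

```python
# Hypothetical model of output-based dependency resolution; not the DataWorks API.
from typing import Dict, Optional, Set


def resolve_ancestor(outputs: Dict[str, Set[str]], table: str) -> Optional[str]:
    """Return the node whose declared outputs include `table`, or None."""
    for node, outs in outputs.items():
        if table in outs:
            return node
    return None


# Node_A declares Table A as an output, and Node_B reads Table A.
outputs = {"Node_A": {"table_a"}, "Node_C": set()}
node_b_input = "table_a"
before_change = resolve_ancestor(outputs, node_b_input)

# Node_A no longer generates Table A, so the output disappears from Node_A.
outputs["Node_A"].discard("table_a")
after_removal = resolve_ancestor(outputs, node_b_input)

# Table A is added as an output of Node_C.
# In practice, Node_C must also be committed and deployed.
outputs["Node_C"].add("table_a")
after_fix = resolve_ancestor(outputs, node_b_input)

print(before_change, after_removal, after_fix)
```

The middle result is `None`, which corresponds to Node_B being an isolated node until Table A is declared as an output of Node_C.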