Retroactive instances are instances that are run to generate retroactive data for auto triggered nodes. You can manage retroactive instances. For example, you can view the running status of instances, and stop, rerun, or unfreeze instances.

Limits

  • When DataWorks generates retroactive data in a period of time for a node, if one instance of the node fails on a day, the retroactive instance for that day is also set to Failed. DataWorks will not run instances of this node for the next day. To sum up, DataWorks runs instances of a node on a day only when all its instances of the previous day are successful.
  • For a self-dependent auto triggered node, if the first instance for which retroactive data needs to be generated has a last-cycle instance on the previous day but the last-cycle instance is not run, the retroactive instance cannot be triggered. If the first instance for which retroactive data needs to be generated does not have a last-cycle instance on the previous day, the retroactive instance is directly triggered.
  • DataWorks generates alerts only for auto triggered node instances that fail.
  • If an auto triggered node has an instance in the Running state, its retroactive or test instance can start to run only after this auto triggered node instance finishes running.
  • If both an auto triggered node instance and a retroactive instance are running for a node, you must stop the retroactive instance to ensure that the auto triggered node instance can be run as expected.
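
The day-by-day rule in the first limit can be illustrated with a short Python sketch. This is not DataWorks code. The function name run_backfill, the run_day callback, and the year 2024 are hypothetical and are used only for illustration.

```python
from datetime import date, timedelta


def run_backfill(start, end, run_day):
    """Run a node one day at a time and stop at the first failed day.

    run_day is a hypothetical callback. It returns True only when all
    instances of the node for the given day are successful.
    """
    results = {}
    day = start
    while day <= end:
        ok = run_day(day)
        results[day] = "Successful" if ok else "Failed"
        if not ok:
            break  # instances for later days are not run
        day += timedelta(days=1)
    return results


# Assumed scenario: the node fails on January 12.
outcome = run_backfill(
    date(2024, 1, 11),
    date(2024, 1, 13),
    lambda d: d != date(2024, 1, 12),
)
```

In this scenario, outcome contains entries for January 11 and January 12 only. January 13 is never run because the previous day failed.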

Generate retroactive data

  1. Log on to the DataWorks console.
  2. In the left-side navigation pane, click Workspaces.
  3. Find the required workspace and click Data Analytics.
  4. Click the icon in the upper-left corner and choose All Products > Operation Center.
  5. In the left-side navigation pane, choose Cycle Task Maintenance > Cycle Task.
  6. Click the rightward arrow in the middle of the page to show the node list. Find the required node, click Patch Data, and then select a mode for generating retroactive data.

    You can also right-click the node in the directed acyclic graph (DAG), move the pointer over Run, and then select a mode for generating retroactive data.

Generate retroactive data for the current node

  1. Find the required node and choose Patch Data > Current Node Retroactively.
  2. In the Patch Data dialog box, set the parameters as required.
    Parameter Description
    Retroactive Instance Name DataWorks automatically generates a retroactive instance name for your node. You can modify the name.
    Data Timestamp The data timestamp of the retroactive instance.
    Node The name of the node for which you want to generate retroactive data. You cannot change the node.
    Parallelism Specifies whether to generate multiple retroactive instances at a time.
    • If you select Disable, only one retroactive instance is generated. The retroactive instance is run multiple times based on the data timestamp in sequence.
    • If you select 2 Parallel Groups, 3 Parallel Groups, 4 Parallel Groups, or 5 Parallel Groups, multiple retroactive instances are generated.
      The retroactive instances are run based on the data timestamp in parallel.
      • If the number of days in the data timestamp is smaller than the number of parallel groups, the retroactive instances are run in parallel. For example, the data timestamp is from January 11 to January 13, and you select 4 Parallel Groups. In this case, three retroactive instances are generated, one for each day in the data timestamp, and are run in parallel.
      • If the number of days in the data timestamp is greater than the number of parallel groups, some instances must be run multiple times in sequence whereas others are run in parallel. For example, the data timestamp is from January 11 to January 13, and you select 2 Parallel Groups. In this case, two retroactive instances are generated. They first run in parallel, each for one day, and one of them then runs a second time for the remaining day.
  3. Click OK.
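
The Parallelism behavior described above can be sketched in Python. This is an illustration, not the DataWorks implementation. The function name split_into_groups and the year 2024 are hypothetical. The sketch also assumes that days are assigned to groups in contiguous chunks and that the earlier groups take the extra days. The page states only how many instances are generated and that some of them run more than once.

```python
from datetime import date, timedelta


def split_into_groups(start, end, groups):
    """Split an inclusive date range into at most `groups` chunks.

    Each chunk stands for one retroactive instance that runs its days
    in sequence. Different chunks are meant to run in parallel.
    """
    if groups < 1:
        raise ValueError("groups must be at least 1")
    if end < start:
        raise ValueError("end must not be earlier than start")
    days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    count = min(groups, len(days))  # never more instances than days
    base, extra = divmod(len(days), count)
    chunks, pos = [], 0
    for i in range(count):
        size = base + (1 if i < extra else 0)  # assumed: early chunks get extras
        chunks.append(days[pos:pos + size])
        pos += size
    return chunks


jan11, jan13 = date(2024, 1, 11), date(2024, 1, 13)
four_groups = split_into_groups(jan11, jan13, 4)  # 3 instances, 1 day each
two_groups = split_into_groups(jan11, jan13, 2)   # 2 instances, one runs twice
disabled = split_into_groups(jan11, jan13, 1)     # 1 instance, 3 days in sequence
```

Under these assumptions, the three calls match the two examples and the Disable case: three one-day instances, two instances of which one covers two days, and a single instance that covers all three days.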

Generate retroactive data for the current and descendant nodes

  1. Find the required node and choose Patch Data > Current and Descendent Nodes Retroactively.
  2. In the Patch Data dialog box, set the parameters as required, including Nodes.
    Parameter Description
    Retroactive Instance Name DataWorks automatically generates a retroactive instance name for your node. You can modify the name.
    Data Timestamp The data timestamp of the retroactive instance.
    Parallelism Specifies whether to generate multiple retroactive instances at a time.
    • If you select Disable, only one retroactive instance is generated.
    • If you select 2 Parallel Groups, 3 Parallel Groups, 4 Parallel Groups, or 5 Parallel Groups, multiple retroactive instances are generated.
    Nodes You can set the Node Name and Node Type parameters to filter and select nodes for which you want to generate retroactive data.
  3. Click OK.

Generate retroactive data for a large number of nodes

  1. Find the required node and choose Patch Data > Mass Nodes Retroactively.
  2. In the Patch Data dialog box, set the parameters as required.
    Parameter Description
    Retroactive Instance Name DataWorks automatically generates a retroactive instance name for your node. You can modify the name.
    Data Timestamp The data timestamp of the retroactive instance.
    Note We recommend that you do not set this parameter to a long range. Otherwise, the retroactive instance may be delayed due to insufficient resources.
    Parallelism Specifies whether to generate multiple retroactive instances at a time.
    • If you select Disable, only one retroactive instance is generated.
    • If you select 2 Parallel Groups, 3 Parallel Groups, 4 Parallel Groups, or 5 Parallel Groups, multiple retroactive instances are generated.
    Nodes
    • If you select Current Node, retroactive instances are generated for the current node and its descendant nodes.
    • If you clear Current Node, a dry-run instance is generated for the current node and retroactive instances are generated for its descendant nodes.
    Workspaces You can select workspaces in the Available Workspaces section and add them to the Selected Workspaces section. Fuzzy match is supported when you search for workspaces in the Available Workspaces section.
    Node Whitelist You can add the nodes outside the selected workspaces, for which you want to generate retroactive data.
    Note You can search for nodes only by node ID.
    Node Blacklist You can add the nodes inside the selected workspaces, for which you do not want to generate retroactive data.
    Note You can search for nodes only by node ID.
  3. Click OK.
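
The way Workspaces, Node Whitelist, and Node Blacklist combine can be sketched as set operations in Python. This is a simplified illustration, not DataWorks code. The function name select_nodes, the workspace names, and the node IDs are hypothetical. The sketch treats each workspace as a flat set of candidate node IDs and does not model the descendant relationship or the Current Node option.

```python
def select_nodes(nodes_by_workspace, selected_workspaces, whitelist, blacklist):
    """Return the node IDs for which retroactive data would be generated.

    Start from all nodes in the selected workspaces, remove the
    blacklisted nodes, and then add the whitelisted nodes from other
    workspaces.
    """
    selected = set()
    for workspace in selected_workspaces:
        selected |= set(nodes_by_workspace.get(workspace, ()))
    selected -= set(blacklist)  # nodes inside the selected workspaces to skip
    selected |= set(whitelist)  # nodes outside the selected workspaces to add
    return selected


# Hypothetical workspaces and node IDs.
nodes_by_workspace = {"ws_a": {101, 102, 103}, "ws_b": {201, 202}}
chosen = select_nodes(
    nodes_by_workspace,
    selected_workspaces=["ws_a"],
    whitelist={201},
    blacklist={103},
)
```

Here chosen contains nodes 101, 102, and 201. Node 103 is excluded by the blacklist, and node 201 is added by the whitelist even though ws_b is not selected.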

Generate retroactive data for a specific node in a node group

Workflows that are created in DataWorks V1.0 are automatically converted to node groups in Operation Center of DataWorks V2.0. To generate retroactive data for a specific node in a node group, perform the following steps:
  1. In the left-side navigation pane of the Operation Center page, choose Cycle Task Maintenance > Cycle Task. Open the DAG of the required node.
  2. Right-click a node group and select View Internal Nodes.
  3. On the page that appears, select the topmost ancestor node of the node for which you want to generate retroactive data. Then, click the Copy icon next to Node ID in the lower-right corner.
  4. Return to the Cycle Task page and enter the copied node ID to search for the node.
  5. Open the DAG of the node that is found, right-click the node, and then choose Run > Current and Descendent Nodes Retroactively.
  6. Select the specific node for which you want to generate retroactive data in the node group.
Note You can find an inner node by starting from its node group, but you cannot find a node group by starting from an inner node.

Instance list

Operation Description
Filter You can find the required instances by setting the filter conditions in the section marked with 1 in the preceding figure.

You can search for instances by node name or node ID and set the parameters such as Retroactive Instance Name, Node Type, Owner, Run At, Data Timestamp, Region, Engine type, Engine instance, Baseline, and My Nodes to filter instances.

Note By default, the data timestamp is set to the previous day of the current day.
DAG You can open the DAG of the current node to view the running results of the instances.
Stop Stop the instance. You can stop an instance only when it is in the Pending or Running state. After you perform this operation, the instance enters the Failed state.
Rerun Reschedule the instance.
Rerun Descendent Nodes Rerun the descendant nodes of the current node.
Freeze Freeze the current node and pause the scheduling of the node.
Unfreeze Resume the scheduling of the frozen node.
View Lineage Allows you to view the lineage of the node.

Retroactive instance in a DAG

Click the name of an instance, or click DAG in the Actions column, to open the DAG of the instance. In the DAG, you can right-click the instance to perform related operations.

Note After you click the Refresh icon in the upper-right corner, only the DAG of the instance is refreshed, but not the operational logs of the instance.
Operation Description
Show Ancestor Nodes or Show Descendent Nodes If a workflow contains three or more nodes, some nodes are automatically hidden in the DAG in Operation Center. You can select the number of levels to view all nodes at one or more levels.
View Runtime Log Allows you to view the operational logs of the current instance if it is in the Running, Successful, or Failed state.
View Code Allows you to view the code of the current instance.
Edit Node Click this menu item to go to the DataStudio page to modify the current node.
View Lineage Allows you to view the lineage of the current instance.
Stop Stop the instance. You can stop an instance only when it is in the Pending or Running state. After you perform this operation, the instance enters the Failed state.
Rerun Rerun the instance if it is in the Failed state or an abnormal state.
Rerun Descendent Nodes Rerun all the descendant instances of the current node. If multiple descendant instances exist, all these instances are rerun.
Set Status to Successful Set the status of the current instance to Successful and run its pending descendant instances. Perform this operation if an instance fails.
Note Only the status of a failed instance can be set to Successful. This operation does not apply to workflows.
Emergency Operations Perform emergency operations only in an emergency. An emergency operation takes effect only once and only on the current node.

Select Delete Dependencies to delete the dependencies of the current node. Perform this operation so that you can start the current node when the ancestor instances fail and the current instance does not depend on the data of the ancestor instances.

Freeze Freeze the current node and pause the scheduling of the node.
Unfreeze Resume the scheduling of the frozen node.

Instance states

No. State
1 Successful
2 Pending (Ancestor)
3 Failed
4 Running
5 Pending (Schedule)
6 Freeze
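
The state restrictions stated on this page for Stop, Set Status to Successful, and View Runtime Log can be summarized in a small Python lookup. This is an illustration, not a DataWorks API. The names ALLOWED_STATES, RESULTING_STATE, can_apply, and apply_operation are hypothetical. The sketch also assumes that "Pending" covers both Pending (Ancestor) and Pending (Schedule). Operations whose state rules are not fully specified on this page, such as Rerun, are left out.

```python
PENDING = {"Pending (Ancestor)", "Pending (Schedule)"}  # assumed grouping

# States in which each operation is allowed, as described on this page.
ALLOWED_STATES = {
    "Stop": PENDING | {"Running"},
    "Set Status to Successful": {"Failed"},
    "View Runtime Log": {"Running", "Successful", "Failed"},
}

# State that the instance enters after the operation, where one is stated.
RESULTING_STATE = {
    "Stop": "Failed",
    "Set Status to Successful": "Successful",
}


def can_apply(operation, state):
    """Return True if the operation is allowed in the given state."""
    return state in ALLOWED_STATES[operation]


def apply_operation(operation, state):
    """Return the state after the operation, or raise if it is not allowed."""
    if not can_apply(operation, state):
        raise ValueError(f"{operation} is not allowed in state {state}")
    return RESULTING_STATE.get(operation, state)


after_stop = apply_operation("Stop", "Running")  # instance enters Failed
after_fix = apply_operation("Set Status to Successful", after_stop)
```

In this sketch, a stopped instance ends in the Failed state and can then be set to Successful, which matches the sequence described for the two operations.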