Retroactive instances are generated when DataWorks generates retroactive data for auto triggered nodes. You can manage retroactive instances. For example, you can view the running status of instances, and stop, rerun, or unfreeze instances.

Limits

  • When you generate retroactive data for a node over a period of time, if an instance of the node fails on one day, the retroactive instance for that day is set to Failed. DataWorks runs the node's instances for the next day only after all of its instances for the previous day are successful.
  • For a self-dependent auto triggered node, the first instance for which retroactive data needs to be generated is handled as follows: If it has a last-cycle instance on the previous day and that last-cycle instance has not been run, the retroactive instance cannot be triggered. If it does not have a last-cycle instance on the previous day, the retroactive instance is triggered directly.
  • Currently, DataWorks generates alerts only for auto triggered node instances. DataWorks does not generate alerts for manually triggered node instances, retroactive instances, or test instances.
  • If a node has an auto triggered node instance in the Running state, its retroactive or test instance can start to run only after this auto triggered node instance finishes running.
  • If an auto triggered node instance and a retroactive instance of the same node are running at the same time, you must stop the retroactive instance to ensure that the auto triggered node instance runs properly.
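
The first two limits can be read as simple gating rules. The following Python sketch is only an illustration of those rules, not DataWorks code: the function names, the `run_day` callback, and the "Not run" label are hypothetical, and the case where a last-cycle instance exists and has already run is assumed to allow triggering.

```python
from datetime import date, timedelta
from typing import Callable, Dict, List


def run_backfill(start: date, end: date,
                 run_day: Callable[[date], List[bool]]) -> Dict[date, str]:
    """Simulate day-by-day gating for one node.

    run_day(day) is a hypothetical callback that runs all instances of the
    node for that day and returns one success flag per instance.
    """
    outcome: Dict[date, str] = {}
    blocked = False
    day = start
    while day <= end:
        if blocked:
            # "Not run" is an illustrative label, not a DataWorks status.
            outcome[day] = "Not run"
        else:
            succeeded = all(run_day(day))
            outcome[day] = "Successful" if succeeded else "Failed"
            # A failure on this day keeps all later days from running.
            blocked = not succeeded
        day += timedelta(days=1)
    return outcome


def can_trigger_first_instance(has_last_cycle_instance: bool,
                               last_cycle_instance_ran: bool) -> bool:
    """Rule for the first retroactive instance of a self-dependent node."""
    if not has_last_cycle_instance:
        return True  # triggered directly
    # Assumption: a last-cycle instance that has run does not block.
    return last_cycle_instance_ran
```

In this sketch, a failure on the second of three days leaves the first day Successful, the second day Failed, and the third day not run.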

Access the menu for generating retroactive data

  1. Log on to the DataWorks console. In the left-side navigation pane, click Workspaces. On the Workspaces page, find the target workspace and click Data Analytics in the Actions column.
  2. On the DataStudio page that appears, click the icon in the upper-left corner and choose All Products > Operation Center.
  3. In the left-side navigation pane, choose Cycle Task Maintenance > Cycle Task.
  4. On the page that appears, click the rightward arrow in the middle to show the Actions column in the node list. Find the target node, click Patch Data in the Actions column, and then select a mode for generating retroactive data.

    You can also right-click the target node in the directed acyclic graph (DAG), move the pointer over Run, and then select a mode for generating retroactive data.

Generate retroactive data for the current node

  1. Find the target node, click Patch Data in the Actions column, and then select Current Node Retroactively.
  2. In the Patch Data dialog box that appears, set the parameters.
    Parameter: Description
    Retroactive Instance Name: The name of the retroactive instance. DataWorks automatically generates a name, which you can modify.
    Data Timestamp: The data timestamp of the retroactive instance.
    Node: The name of the node for which you want to generate retroactive data. This value cannot be modified.
    Parallelism: Specifies whether to generate multiple retroactive instances at a time.
    • If you select Disable, only one retroactive instance is generated.
    • If you select 2 Parallel Groups, 3 Parallel Groups, 4 Parallel Groups, or 5 Parallel Groups, multiple retroactive instances are generated.
    Note:
    • Disable: One retroactive instance is run once for each day in the data timestamp, in sequence.
    • Other options: Multiple retroactive instances are run in parallel or in sequence, depending on the data timestamp.
      • If the number of days in the data timestamp is smaller than the number of parallel groups, the retroactive instances are run in parallel. For example, the data timestamp is from January 11 to January 13, and you select four parallel groups. In this case, three retroactive instances are generated, one for each day in the data timestamp, and the three instances are run in parallel.
      • If the number of days in the data timestamp is larger than the number of parallel groups, some instances must be run multiple times in sequence while others are run in parallel. For example, the data timestamp is from January 11 to January 13, and you select two parallel groups. In this case, two retroactive instances are generated. They are run in parallel once, and one of them is then run a second time.
  3. Click OK.
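
The Parallelism behavior described in step 2 can be sketched in Python. This is a minimal illustration, not the DataWorks scheduler: `plan_parallel_groups` is a hypothetical helper, a value of 1 stands in for Disable, and the way days are assigned to instances (contiguous chunks, with earlier instances taking any extra day) is an assumption because the source gives only the number of instances and runs. The case where the number of days equals the number of parallel groups is also assumed to yield one day per instance.

```python
from datetime import date, timedelta
from typing import List


def plan_parallel_groups(start: date, end: date,
                         parallel_groups: int) -> List[List[date]]:
    """Return one list of days per retroactive instance.

    Each inner list is run in sequence by one instance; the instances
    themselves run in parallel.
    """
    days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    if parallel_groups <= 1:
        # Disable: a single instance walks through every day in sequence.
        return [days]
    # No more instances are generated than there are days to cover.
    instance_count = min(len(days), parallel_groups)
    base, extra = divmod(len(days), instance_count)
    plan: List[List[date]] = []
    index = 0
    for group in range(instance_count):
        # Assumption: contiguous chunks, earlier instances take the extra day.
        size = base + (1 if group < extra else 0)
        plan.append(days[index:index + size])
        index += size
    return plan
```

For a data timestamp of January 11 to January 13 (the year is arbitrary here), four parallel groups give three instances with one day each, and two parallel groups give two instances, one of which covers two days in sequence.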

Generate retroactive data for the current and descendant nodes

  1. Find the target node, click Patch Data in the Actions column, and then select Current and Descendant Nodes Retroactively.
  2. In the Patch Data dialog box that appears, set the parameters, including Nodes.
    Parameter: Description
    Retroactive Instance Name: The name of the retroactive instance. DataWorks automatically generates a name, which can be modified.
    Data Timestamp: The data timestamp of the retroactive instance.
    Parallelism: Specifies whether to generate multiple retroactive instances at a time.
    • If you select Disable, only one retroactive instance is generated.
    • If you select 2 Parallel Groups, 3 Parallel Groups, 4 Parallel Groups, or 5 Parallel Groups, multiple retroactive instances are generated.
    Nodes: The nodes for which you want to generate retroactive data. You can set Node Name and Node Type to filter nodes.
  3. Click OK.

Generate retroactive data for a large number of nodes

  1. Find the target node, click Patch Data in the Actions column, and then select Mass Nodes Retroactively.
  2. In the Patch Data dialog box that appears, set the parameters.
    Parameter: Description
    Retroactive Instance Name: The name of the retroactive instance. DataWorks automatically generates a name, which you can modify.
    Data Timestamp: The data timestamp of the retroactive instance.
    Note: We recommend that you do not set this parameter to a long range. Otherwise, the retroactive instance may be delayed due to insufficient resources.
    Parallelism: Specifies whether to generate multiple retroactive instances at a time.
    • If you select Disable, only one retroactive instance is generated.
    • If you select 2 Parallel Groups, 3 Parallel Groups, 4 Parallel Groups, or 5 Parallel Groups, multiple retroactive instances are generated.
    Nodes: Specifies whether retroactive data is generated for the current node.
    • If you select the Current Node synchronization check box, retroactive instances are generated for the current node and its descendant nodes.
    • If you clear the Current Node synchronization check box, the current node performs a dry run and retroactive instances are generated only for the descendant nodes.
    Workspaces: Select workspaces under Available Workspaces and add them to Selected Workspaces. Fuzzy match is supported when you search for workspaces under Available Workspaces.
    Node Whitelist: The nodes outside the selected workspaces for which DataWorks needs to generate retroactive data.
    Note: You can search for the nodes by node ID.
    Node Blacklist: The nodes inside the selected workspaces for which DataWorks does not need to generate retroactive data.
    Note: You can search for the nodes by node ID.
  3. Click OK.
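
The combined effect of Workspaces, Node Whitelist, and Node Blacklist can be expressed as set arithmetic. The sketch below is an illustration under assumptions, not a DataWorks API: `select_nodes` and its arguments are hypothetical, nodes are identified by integer node IDs, and the input is assumed to already contain only the descendant nodes of the current node, grouped by workspace.

```python
from typing import Dict, Set


def select_nodes(descendants_by_workspace: Dict[str, Set[int]],
                 selected_workspaces: Set[str],
                 whitelist: Set[int],
                 blacklist: Set[int]) -> Set[int]:
    """Return the IDs of the nodes that receive retroactive instances."""
    in_selected: Set[int] = set()
    for workspace in selected_workspaces:
        in_selected |= descendants_by_workspace.get(workspace, set())
    # The blacklist removes nodes inside the selected workspaces.
    # The whitelist adds nodes outside the selected workspaces.
    # By definition the two lists do not overlap.
    return (in_selected - blacklist) | whitelist
```

For example, with workspace `ws_a` holding nodes 1, 2, and 3, workspace `ws_b` holding nodes 4 and 5, only `ws_a` selected, node 5 on the whitelist, and node 2 on the blacklist, the result is nodes 1, 3, and 5.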

Generate retroactive data for a specific node in a combined node

Workflows created in DataWorks V1.0 are automatically converted to combined nodes in Operation Center of DataWorks V2.0. If you want to generate retroactive data for a specific node in a combined node, follow these steps:
  1. Go to the Operation Center page. In the left-side navigation pane, choose Cycle Task Maintenance > Cycle Task. On the page that appears, find the target node and click DAG in the Actions column.
  2. Right-click a combined node and select View Inner Nodes.
  3. On the page that appears, select the topmost node in the DAG and click the Copy icon to the right of Node ID in the lower-right corner.
  4. Return to the Cycle Task page and enter the node ID copied in the previous step to search for the node.
  5. Open the DAG of the node that is found, right-click the node name, and then choose Run > Current and Descendant Nodes Retroactively.
  6. Select the specific node for which you want to generate retroactive data in the combined node.
Note: Currently, you can find an inner node only by starting from its combined node. You cannot find a combined node by starting from one of its inner nodes.

Manage retroactive instances in the instance list

Operation: Description
Filter: Find required instances by setting parameters in the area marked with 1 in the preceding figure.

  You can search for instances by node name or node ID and set Retroactive Instance Name, Node Type, Owner, Run At, Data Timestamp, Region, Engine type, Engine instance, Baseline, and My Nodes to filter instances.

  Note: By default, the data timestamp is set to the day before the current day.
DAG: View the DAG of the instance. You can view the running result of the instance in the DAG.
Stop: Stop the instance. You can stop only instances in the Pending (Resource) or Running state. After you perform this operation, the instance enters the Failed state.
Rerun: Rerun the instance.
Rerun Descendant Nodes: Rerun the descendant instances of the instance.
Freeze: Pause the scheduling of the instance.
Unfreeze: Resume the scheduling of the instance after it is frozen.
View Lineage: View the lineage of the instance.

Manage retroactive instances in a DAG

Click the name of an instance, or click DAG in the Actions column, to view the DAG of the instance. In the DAG, you can right-click the instance to perform related operations.

Note: After you click Refresh in the upper-right corner, the DAG of the instance is refreshed, but the operational logs are not.
Operation: Description
Show Ancestor Nodes and Show Descendant Nodes: Show ancestor or descendant instances at one or more levels. If a workflow contains three or more instances, the DAG displays only the current instance and hides its ancestor and descendant instances.
View Runtime Log: View the operational logs of the instance if it is in the Running, Successful, or Failed state.
View Code: View the code of the instance.
Edit Node: Go to the DataStudio page to modify the node to which the instance belongs.
View Lineage: View the lineage of the instance.
Stop: Stop the instance. You can stop only instances in the Pending (Resource) or Running state. After you perform this operation, the instance enters the Failed state.
Rerun: Rerun the instance if it is in the Failed state or an abnormal state.
Rerun Descendant Nodes: Rerun all the descendant instances of the instance.
Set Status to Successful: Set the status of the instance to Successful and run its descendant instances as scheduled. You can perform this operation if an instance fails.
Note: This operation applies only to failed instances, not to workflows.
Emergency Operations: Perform these operations only in an emergency. They take effect only once and only for the current instance.

  Select Delete Dependencies to delete the dependencies of the current instance. Use this operation to start the current instance when its ancestor instances have failed and the current instance does not depend on the data of the ancestor instances.

Freeze: Pause the scheduling of the instance.
Unfreeze: Resume the scheduling of the instance after it is frozen.

Instance statuses

No. Status
1 Successful
2 Pending (Ancestor)
3 Failed
4 Running
5 Pending (Resource)
6 Pending (Schedule)
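
The statuses above, together with the state conditions given for Stop and View Runtime Log, can be modeled as a small Python enumeration. This is a sketch for reasoning about the rules only: `InstanceStatus`, `STOPPABLE`, `LOG_VIEWABLE`, and `stop` are hypothetical names, not part of any DataWorks SDK.

```python
from enum import Enum


class InstanceStatus(Enum):
    SUCCESSFUL = "Successful"
    PENDING_ANCESTOR = "Pending (Ancestor)"
    FAILED = "Failed"
    RUNNING = "Running"
    PENDING_RESOURCE = "Pending (Resource)"
    PENDING_SCHEDULE = "Pending (Schedule)"


# Only instances in these states can be stopped.
STOPPABLE = {InstanceStatus.PENDING_RESOURCE, InstanceStatus.RUNNING}

# Operational logs can be viewed only in these states.
LOG_VIEWABLE = {
    InstanceStatus.RUNNING,
    InstanceStatus.SUCCESSFUL,
    InstanceStatus.FAILED,
}


def stop(status: InstanceStatus) -> InstanceStatus:
    """Return the status of an instance after a Stop operation."""
    if status not in STOPPABLE:
        raise ValueError(f"cannot stop an instance in the {status.value} state")
    # A stopped instance enters the Failed state.
    return InstanceStatus.FAILED
```

Calling `stop(InstanceStatus.RUNNING)` returns `InstanceStatus.FAILED`, while calling it with `InstanceStatus.SUCCESSFUL` raises `ValueError` because that state cannot be stopped.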