This topic describes how to configure recurrence and dependencies for a node in DataWorks. The following example uses the synchronization node write_result, which is scheduled to run weekly.
Configure recurrence for the synchronization node
- Go to the DataStudio page.
  - Log on to the DataWorks console.
  - In the left-side navigation pane, click Workspaces.
  - In the top navigation bar, select the region in which the workspace that you want to manage resides. Find the workspace and click DataStudio in the Actions column.
- Find the workflow to which the synchronization node write_result belongs and double-click the synchronization node.
- On the node configuration tab, click Properties in the right-side navigation pane.
  Note: In a manually triggered workflow, all nodes must be manually triggered and cannot be automatically scheduled by DataWorks.
- In the Schedule section of the Properties tab, set the parameters as required.
Instance Generation Mode: The time to generate the first instance. Valid values: Next Day and Immediately After Deployment. For more information, see Configure immediate instance generation for a node.
Recurrence: The mode in which the node is run. Valid values:
- Normal: If you set the Recurrence parameter to Normal, the node is run and generates data based on the setting of the Scheduling Cycle parameter.
- Skip Execution: If you set the Recurrence parameter to Skip Execution, the node is scheduled based on the recurrence and the scheduled time that you specify. However, the status of the node becomes Freeze and no data is generated for this node.
- Dry Run: If you set the Recurrence parameter to Dry Run, the node is run based on the setting of the Scheduling Cycle parameter. However, the node performs a dry run and no data is generated.
Scheduling Cycle: The recurrence of the node. Valid values: Minute, Hour, Day, Week, Month, and Year. In this example, this parameter is set to Week, the Run Every parameter to Monday and Tuesday, and the Run At parameter to 00:00. In this case, the synchronization node is scheduled to run at 00:00 every Monday and Tuesday.
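To make the weekly example concrete, the following Python sketch lists the upcoming run times for a node that runs at 00:00 every Monday and Tuesday. It only illustrates the calendar arithmetic; the function name next_runs and the constants are hypothetical and are not part of DataWorks.

```python
from datetime import datetime, time, timedelta

# Python's date.weekday() numbers Monday as 0 and Tuesday as 1.
RUN_DAYS = {0, 1}
RUN_AT = time(0, 0)


def next_runs(after, count):
    """Return the next `count` scheduled times strictly later than `after`."""
    runs = []
    day = after.date()
    while len(runs) < count:
        candidate = datetime.combine(day, RUN_AT)
        if day.weekday() in RUN_DAYS and candidate > after:
            runs.append(candidate)
        day += timedelta(days=1)
    return runs


if __name__ == "__main__":
    # 2024-01-03 is a Wednesday, so the next runs fall in the following week.
    for run in next_runs(datetime(2024, 1, 3, 12, 0), 3):
        print(run.isoformat())
```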
Cron Expression: The CRON expression of the scheduling time that you specified. The value cannot be modified.
Timeout Definition: The timeout period. If the node does not complete within the timeout period specified by the Timeout Definition parameter, the node automatically stops and is not rerun.
- The timeout period applies to auto triggered node instances, data backfill instances, and test node instances.
- The default timeout period ranges from 72 hours to 168 hours. The system automatically adjusts the default timeout period for a node based on system loads.
- You can specify a custom timeout period after you select Instances of Custom Nodes. For nodes scheduled by exclusive resource groups for scheduling, the valid timeout period is 1 to 72 hours. For nodes scheduled by shared resource groups for scheduling, the valid timeout period is 1 to 168 hours.
- If a node times out, the traffic and computing resources that have been consumed by the node are still billed.
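The custom timeout ranges above can be checked before you enter a value in the console. The following Python sketch encodes only the two ranges stated in this topic; the function name validate_timeout and the keys "exclusive" and "shared" are hypothetical labels chosen for illustration, not DataWorks identifiers.

```python
# Valid custom timeout ranges in hours, as stated in this topic.
# The dictionary keys are illustrative labels for the resource group type.
TIMEOUT_RANGE_HOURS = {
    "exclusive": (1, 72),   # exclusive resource groups for scheduling
    "shared": (1, 168),     # shared resource groups for scheduling
}


def validate_timeout(hours, group_type):
    """Return True if `hours` is a valid custom timeout for the group type."""
    low, high = TIMEOUT_RANGE_HOURS[group_type]
    return low <= hours <= high


if __name__ == "__main__":
    print(validate_timeout(100, "exclusive"))
    print(validate_timeout(100, "shared"))
```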
Rerun: Specifies whether to allow the node to be rerun. Valid values: Allow Regardless of Running Status, Allow upon Failure Only, and Disallow Regardless of Running Status.
Auto Rerun upon Error: Specifies whether to automatically rerun the node if an error occurs. This parameter appears only if you set the Rerun parameter to Allow Regardless of Running Status or Allow upon Failure Only. After you select this check box, the node is automatically rerun if an error occurs. This parameter does not appear if you set the Rerun parameter to Disallow Regardless of Running Status. In this case, the node is not rerun if an error occurs.
Number of Reruns: The maximum number of reruns allowed. This parameter appears only if you select Auto Rerun upon Error.
Rerun Interval: The interval at which the node is rerun after an error occurs. This parameter appears only if you select Auto Rerun upon Error. You can set this parameter based on your requirements. Valid values: 1 to 30. Default value: 30. Unit: minutes.
Validity Period: The validity period of the node. Specify the start and end dates of the validity period as required.
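The rerun parameters depend on each other: Auto Rerun upon Error is available only when reruns are allowed, and Rerun Interval accepts 1 to 30 minutes. The following Python sketch models these rules as a consistency check. The class RerunSettings and its field names are hypothetical and exist only for this illustration. Number of Reruns is left out because this topic does not state its valid range.

```python
from dataclasses import dataclass

# The three Rerun values listed in this topic.
ALLOW_ALWAYS = "Allow Regardless of Running Status"
ALLOW_ON_FAILURE = "Allow upon Failure Only"
DISALLOW = "Disallow Regardless of Running Status"


@dataclass
class RerunSettings:
    rerun: str
    auto_rerun_upon_error: bool = False
    rerun_interval_minutes: int = 30  # default value stated in this topic

    def problems(self):
        """Return a list of problems; an empty list means the settings agree."""
        found = []
        if self.rerun not in (ALLOW_ALWAYS, ALLOW_ON_FAILURE, DISALLOW):
            found.append(f"unknown Rerun value: {self.rerun!r}")
        if self.auto_rerun_upon_error and self.rerun == DISALLOW:
            found.append("Auto Rerun upon Error is unavailable when reruns are disallowed")
        if self.auto_rerun_upon_error and not 1 <= self.rerun_interval_minutes <= 30:
            found.append("Rerun Interval must be between 1 and 30 minutes")
        return found


if __name__ == "__main__":
    print(RerunSettings(ALLOW_ON_FAILURE, True, 10).problems())
    print(RerunSettings(DISALLOW, True).problems())
```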
Configure dependencies for the synchronization node
After you configure recurrence for the synchronization node write_result, you can continue to configure dependencies for the synchronization node.
You can configure the parent node on which the synchronization node depends. After that, the scheduling system triggers the synchronization node only after the instance of the parent node is run.
For example, the instance of the synchronization node is not triggered until the instance of its parent node insert_data is run.
By default, the scheduling system creates a node named in the format of Workspace name_root for each workspace as the root node. If no parent node is configured for the synchronization node, the synchronization node depends on the root node.
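The dependency rule described above can be sketched in a few lines of Python: a node is triggered only after the instances of all its parent nodes have run, and a node without a configured parent falls back to the workspace root node. The function names, the dictionary layout, and the workspace name demo are assumptions made for this illustration and do not reflect how DataWorks stores dependencies.

```python
def resolve_parents(node, parents_by_node, workspace_name):
    """Return the configured parents, or the workspace root node if none exist."""
    parents = parents_by_node.get(node, [])
    return parents if parents else [f"{workspace_name}_root"]


def can_trigger(node, parents_by_node, finished, workspace_name):
    """Return True once the instances of all parent nodes have run."""
    parents = resolve_parents(node, parents_by_node, workspace_name)
    return all(parent in finished for parent in parents)


if __name__ == "__main__":
    deps = {"write_result": ["insert_data"]}
    # The parent instance has not run yet, so write_result must wait.
    print(can_trigger("write_result", deps, set(), "demo"))
    # After insert_data has run, write_result can be triggered.
    print(can_trigger("write_result", deps, {"insert_data"}, "demo"))
```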
Commit and deploy the node
- On the configuration tab of the write_result node, click the icon in the toolbar.
- Commit the node.
  Notice: You can commit the node only after you set the Rerun and Parent Nodes parameters.
  - Click the icon in the top toolbar.
  - In the Commit Node dialog box, enter your comments in the Change description field.
  - Click OK.
  If you use a workspace in standard mode, the node is committed to the development environment after you click OK. If you want to deploy the node to the production environment, click Deploy in the upper-right corner of the toolbar. For more information, see Deploy nodes.
  A node must be committed to the scheduling system so that the scheduling system can automatically generate and run instances for the node. The scheduling system runs these instances at the specified time from the next day based on the recurrence settings.
  Note: If you commit a node after 23:30, the scheduling system automatically generates and runs instances for the node from the third day.
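The note about committing after 23:30 can be expressed as a small date calculation. The following Python sketch assumes that "the third day" counts the commit day as the first day, so a late commit adds two days instead of one. It also assumes that a commit at exactly 23:30 still counts as on time. Both readings are assumptions, and the function name first_instance_date is hypothetical.

```python
from datetime import datetime, time, timedelta

# Commits later than this time of day miss the next-day instance generation.
CUTOFF = time(23, 30)


def first_instance_date(committed_at):
    """Return the first date on which instances are generated for the node."""
    # Assumption: "third day" means the commit day plus two days.
    delay_days = 2 if committed_at.time() > CUTOFF else 1
    return committed_at.date() + timedelta(days=delay_days)


if __name__ == "__main__":
    print(first_instance_date(datetime(2024, 1, 3, 10, 0)))
    print(first_instance_date(datetime(2024, 1, 3, 23, 45)))
```

The returned date is only the earliest day on which instances can exist. Whether the node actually runs on that day still depends on the recurrence settings, such as the Monday and Tuesday schedule in this example.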
What to do next
Now you have learned how to configure recurrence and dependencies for a synchronization node. You can proceed with the next tutorial. In the next tutorial, you will learn how to perform O&M operations on the committed node and troubleshoot errors based on the operational logs. For more information, see Run a node and troubleshoot errors.