A scheduling period defines how often a Node runs automatically in a production environment. The DataWorks scheduling system generates recurring instances based on this period and triggers their execution according to Node dependencies and each instance's scheduled time.
Core concepts
Recurring instance: A recurring instance is a runtime entity that the scheduling system generates for each business date based on a task's scheduling configuration, such as running daily at 00:00. The task's execution, status, and logs are all associated with this instance.
Cross-period dependency: DataWorks supports dependencies between Nodes with different scheduling periods. For example, a daily downstream Node can depend on an hourly upstream Node. The dependency is established between the recurring instances they generate. For more information, see Understanding dependencies between different scheduling periods.
Dry-run: For tasks not scheduled to run daily, such as weekly, monthly, or yearly tasks, the scheduling system generates a dry-run instance on non-execution days. When its scheduled time is reached, this instance immediately transitions to a "Succeeded" state but does not execute the code within the Node. The primary purpose of a dry-run is to resolve dependencies, allowing downstream Nodes to run as scheduled. A dry-run instance has the following characteristics:
Its status is "Succeeded", its duration is 0 seconds, and it generates no execution logs.
It does not consume scheduling or compute resources.
It does not block downstream Nodes. Even if an upstream Node performs a dry-run, downstream Nodes run as scheduled once their own conditions are met.
Basic principles and scenarios
Execution conditions
A recurring instance is executed only if both of the following conditions are met, in no particular order:
All its dependent upstream instances have completed successfully, including instances that succeeded via a dry-run.
The instance's own scheduled time has been reached.
Therefore, the configured scheduling time is only the expected scheduled time. Factors such as the completion time of upstream Nodes, the availability of compute resources, and other runtime conditions affect the Node's actual execution time.
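The two execution conditions can be expressed as a small readiness check. This is a minimal sketch for illustration only: the `Instance` class, its fields, and the `is_ready` function are hypothetical names, not part of any DataWorks API, and resource availability is ignored.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class Instance:
    """Hypothetical model of a recurring instance (not a DataWorks type)."""
    name: str
    scheduled_time: datetime
    succeeded: bool = False  # True after a real run or a dry-run succeeds
    upstream: List["Instance"] = field(default_factory=list)


def is_ready(instance: Instance, now: datetime) -> bool:
    """Return True only when both execution conditions hold."""
    upstream_done = all(u.succeeded for u in instance.upstream)
    time_reached = now >= instance.scheduled_time
    return upstream_done and time_reached


a = Instance("A", datetime(2024, 1, 1, 3, 0))
b = Instance("B", datetime(2024, 1, 1, 5, 0), upstream=[a])

# At 05:30, B's scheduled time has passed, but A has not succeeded yet.
print(is_ready(b, datetime(2024, 1, 1, 5, 30)))  # False
a.succeeded = True
print(is_ready(b, datetime(2024, 1, 1, 5, 30)))  # True
```

In this model a dry-run upstream simply sets `succeeded` to True, which is why it never blocks downstream instances.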
Workflow scheduling scenarios
In a workflow consisting of Nodes A, B, and C (A→B→C), the scheduling time configuration affects the entire workflow execution:
Scenario 1: Unified start time
If the entire workflow must start after 3:00 AM, you only need to set the scheduled time for the root Node, A, to |
Scenario 2: Different start times
If Node A must run at 3:00 AM, Node B after 5:00 AM, and Node C after 6:00 AM, you must set their respective scheduled times to | |
Scenario 3: Specific start times
If Node A must run at 3:00 AM and Node B must run after 5:00 AM, while Node C has no specific time requirement, you must set the scheduled times for Node A to | |
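To see how scheduled times and upstream completion interact in an A→B→C chain, the following sketch computes each Node's start as the later of its own scheduled time and its upstream's finish time. The function name, the run durations, and the concrete scheduled times are illustrative assumptions, and waiting for compute resources is ignored.

```python
from datetime import datetime, timedelta


def simulate_chain(nodes):
    """nodes: list of (name, scheduled_time, duration) in dependency order.

    Returns {name: (actual_start, finish)}, assuming each Node depends only
    on the previous one and resources are always available.
    """
    result = {}
    upstream_finish = None
    for name, scheduled, duration in nodes:
        if upstream_finish is None:
            start = scheduled
        else:
            start = max(scheduled, upstream_finish)
        finish = start + duration
        result[name] = (start, finish)
        upstream_finish = finish
    return result


day = datetime(2024, 1, 1)
# Hypothetical setup: A at 03:00, B at 05:00, and C with an early scheduled
# time (00:00), so C effectively starts as soon as B finishes.
chain = [
    ("A", day.replace(hour=3), timedelta(minutes=30)),
    ("B", day.replace(hour=5), timedelta(hours=1, minutes=30)),
    ("C", day.replace(hour=0), timedelta(minutes=10)),
]
for name, (start, finish) in simulate_chain(chain).items():
    print(name, start.strftime("%H:%M"), finish.strftime("%H:%M"))
# A 03:00 03:30
# B 05:00 06:30
# C 06:30 06:40
```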
Impact of updating the schedule
After you modify a Node's schedule in Scheduling and republish the Node, the impact depends on the selected Instance Generation Method:
Generate instance on the next day (T+1): When you select this option and publish the Node, the scheduled times of instances already generated for the current day (T) and the previous day (T-1) are updated to the new time. This affects all such instances, regardless of their status (e.g., pending, running, or completed). Future instances are generated based on the new schedule.
Generate instance now: This option immediately generates new instances based on the new configuration. The scheduled times of historical instances do not change.
Scheduling types
DataWorks supports minute, hourly, daily, weekly, monthly, and yearly scheduling periods. On the Node editing page in DataStudio (Data Development), click Scheduling in the right-side pane and configure the settings in the Scheduling Time section.
The method for configuring the scheduling time depends on how the Node is organized:
Workflow Node: If a Node is part of a workflow, its schedule is configured at the workflow level. Individual Nodes within the workflow cannot have separate schedules. To make changes, go to the workflow's scheduling configuration.
Standalone Node: If a Node is not part of any workflow, its schedule is configured independently.
Minute scheduling
The minimum interval for minute scheduling is 1 minute.
Configuration example
Objective: Schedule the Node to run every 30 minutes between 00:00 and 23:59 each day.
The cron expression is automatically generated based on your selections and cannot be edited manually.

Scheduling details
This configuration generates 48 recurring instances per day, with scheduled times at 00:00, 00:30, 01:00, ..., 23:30. The business date for each instance is the current day.
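The instance count can be checked with a short sketch that enumerates the scheduled times for a given range and interval. The `scheduled_times` helper is a hypothetical name used only for this illustration; it does not reproduce how DataWorks generates instances internally.

```python
from datetime import datetime, timedelta


def scheduled_times(start: str, end: str, interval_minutes: int) -> list:
    """List "HH:MM" scheduled times from start to end (inclusive)."""
    fmt = "%H:%M"
    current = datetime.strptime(start, fmt)
    last = datetime.strptime(end, fmt)
    times = []
    while current <= last:
        times.append(current.strftime(fmt))
        current += timedelta(minutes=interval_minutes)
    return times


times = scheduled_times("00:00", "23:59", 30)
print(len(times))            # 48
print(times[:3], times[-1])  # ['00:00', '00:30', '01:00'] 23:30
```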
For more dependency scenarios related to minute scheduling, see Minute task dependencies.
Hourly scheduling
Notes
The time period is a closed interval [start, end]. For example, if you configure a task to run every 1 hour between 00:00 and 03:00, the scheduling system generates four instances with scheduled times at 00:00, 01:00, 02:00, and 03:00.
You can set a time range and interval, or specify multiple discrete run times.
Configuration example
Objective: Schedule the Node to run automatically every 6 hours between 00:00 and 23:59 each day.
The cron expression is automatically generated based on your selections and cannot be edited manually.

Scheduling details
The scheduling system generates four instances per day with scheduled times at 00:00, 06:00, 12:00, and 18:00.
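The closed-interval rule for hourly scheduling can be sketched with minute arithmetic. The helper name `times_in_closed_interval` is an assumption made for this example; the point is only that the end of the range is included when it falls exactly on an interval boundary.

```python
def times_in_closed_interval(start: str, end: str, interval_hours: int) -> list:
    """Scheduled "HH:MM" times in the closed interval [start, end]."""

    def to_minutes(hhmm: str) -> int:
        hours, minutes = hhmm.split(":")
        return int(hours) * 60 + int(minutes)

    step = interval_hours * 60
    # range() excludes its stop value, so add 1 to keep the end inclusive.
    return [
        f"{m // 60:02d}:{m % 60:02d}"
        for m in range(to_minutes(start), to_minutes(end) + 1, step)
    ]


print(times_in_closed_interval("00:00", "03:00", 1))
# ['00:00', '01:00', '02:00', '03:00']
print(times_in_closed_interval("00:00", "23:59", 6))
# ['00:00', '06:00', '12:00', '18:00']
```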
For more dependency scenarios related to hourly scheduling, see Hourly task dependencies.
Daily scheduling
A daily Node runs once per day at a specified time. When you create a new recurring task, the default scheduled time is set to a random time between 00:00 and 00:30, which you can modify as needed.
Configuration example
Objective: Schedule the Node to run once at 13:00 every day.
The cron expression is automatically generated based on your selections and cannot be edited manually.

Scheduling details
The scheduling system generates one instance for this task each day, with a scheduled time of 13:00.
For more dependency scenarios related to daily scheduling, see Daily task dependencies.
Weekly scheduling
On non-scheduled days, a weekly Node performs a dry-run to ensure that downstream dependencies can run as scheduled. For more information, see Dry-run.
Configuration example
Objective: Schedule the task to run at a specific time every Monday and Friday. Instances for Mondays and Fridays will run fully, while instances for all other days will perform a dry-run.
The cron expression is automatically generated based on your selections and cannot be edited manually.

Scheduling details
The scheduling system automatically generates and runs instances for the task.
When using the data backfill feature, carefully select the correct business date. In DataWorks, the rule is business date = scheduled date - 1 day. For example, to backfill a weekly task that runs on a Monday, you must select the preceding Sunday as the business date. If you select any other business date, the backfill instance will perform a dry-run.
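The backfill rule for the Monday and Friday example can be sketched as follows. The `RUN_WEEKDAYS` constant and the `backfill_mode` function are hypothetical names for this illustration; only the relation business date = scheduled date - 1 day comes from the text above.

```python
from datetime import date, timedelta

# Monday = 0 and Friday = 4 in the date.weekday() convention.
RUN_WEEKDAYS = {0, 4}


def backfill_mode(business_date: date) -> str:
    """Return "run" or "dry-run" for a backfill on the given business date."""
    scheduled_date = business_date + timedelta(days=1)
    return "run" if scheduled_date.weekday() in RUN_WEEKDAYS else "dry-run"


# 2024-01-07 is a Sunday, so the scheduled date is Monday, 2024-01-08.
print(backfill_mode(date(2024, 1, 7)))  # run
# Business date Monday, 2024-01-08 maps to Tuesday, a non-scheduled day.
print(backfill_mode(date(2024, 1, 8)))  # dry-run
```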
Monthly scheduling
On non-scheduled days, a monthly Node performs a dry-run to ensure that downstream dependencies can run as scheduled. For more information, see Dry-run.
For monthly scheduling, you can set the Specified Time to Last Day Of Each Month.
Configuration example
Objective: Schedule the task to run at a specific time on the last day of each month. Instances for the last day of the month will run fully, while instances for all other days will perform a dry-run.
The cron expression is automatically generated based on your selections and cannot be edited manually.

Scheduling details
The scheduling system automatically generates and runs instances for the task.
When using the data backfill feature, carefully select the correct business date. In DataWorks, the rule is business date = scheduled date - 1 day. For example, to backfill a monthly task that runs on the last day of January (January 31), you must select January 30 as the business date. If you select any other business date, the backfill instance will perform a dry-run.
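For a task scheduled on the last day of each month, the scheduled date and the matching business date can be computed with the standard library. The function names below are assumptions for this sketch, not DataWorks identifiers.

```python
import calendar
from datetime import date, timedelta


def last_day_of_month(year: int, month: int) -> date:
    """Return the last calendar day of the given month."""
    return date(year, month, calendar.monthrange(year, month)[1])


def business_date_for(scheduled_date: date) -> date:
    """Apply the rule: business date = scheduled date - 1 day."""
    return scheduled_date - timedelta(days=1)


run_day = last_day_of_month(2024, 1)
print(run_day, business_date_for(run_day))  # 2024-01-31 2024-01-30
print(last_day_of_month(2024, 2))           # 2024-02-29
```

Because month lengths vary (and February depends on leap years), deriving the business date from the computed last day is safer than hard-coding it.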
Yearly scheduling
On non-scheduled days, a yearly Node performs a dry-run to ensure that downstream dependencies can run as scheduled. For more information, see Dry-run.
Configuration example
Objective: Schedule the task to run on the 1st and last day of January, April, July, and October each year. Instances for these specific dates will run fully, while instances for all other days will perform a dry-run.

Scheduling details
The scheduling system automatically generates and runs instances for the task.
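As a rough illustration of the yearly example, the sketch below lists the dates on which instances would run fully in a given year; instances on all other days would be dry-runs. The `yearly_run_dates` function is a hypothetical helper, and the year 2024 is chosen arbitrarily.

```python
import calendar
from datetime import date


def yearly_run_dates(year: int, months=(1, 4, 7, 10)) -> list:
    """First and last day of each configured month in the given year."""
    dates = []
    for month in months:
        last = calendar.monthrange(year, month)[1]
        dates.append(date(year, month, 1))
        dates.append(date(year, month, last))
    return dates


dates = yearly_run_dates(2024)
print(len(dates))          # 8
print(dates[0], dates[1])  # 2024-01-01 2024-01-31
```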
For more dependency scenarios, see Principles and examples for configuring complex dependency scheduling.