DataWorks: Scheduling time

Last Updated: Mar 23, 2026

A scheduling period defines how often a Node runs automatically in a production environment. The DataWorks scheduling system generates recurring instances based on this period and triggers their execution according to Node dependencies and each instance's scheduled time.

Core concepts

  • Recurring instance: A recurring instance is a runtime entity that the scheduling system generates for each business date based on a task's scheduling configuration, such as running daily at 00:00. The task's execution, status, and logs are all associated with this instance.

  • Cross-period dependency: DataWorks supports dependencies between Nodes with different scheduling periods. For example, a daily downstream Node can depend on an hourly upstream Node. The dependency is established between the recurring instances they generate. For more information, see Understanding dependencies between different scheduling periods.

  • Dry-run: For tasks not scheduled to run daily, such as weekly, monthly, or yearly tasks, the scheduling system generates a dry-run instance on non-execution days. When its scheduled time is reached, this instance immediately transitions to a "Succeeded" state but does not execute the code within the Node. The primary purpose of a dry-run is to resolve dependencies, allowing downstream Nodes to run as scheduled.

    • Its status is "Succeeded", its duration is 0 seconds, and it generates no execution logs.

    • It does not consume scheduling or compute resources.

    • It does not block downstream Nodes. Even if an upstream Node performs a dry-run, downstream Nodes run as scheduled once their own conditions are met.

Basic principles and scenarios

Execution conditions

A recurring instance runs only when both of the following conditions are met. The order in which they are met does not matter:

  • All its dependent upstream instances have completed successfully, including instances that succeeded via a dry-run.

  • The instance's own scheduled time has been reached.

Therefore, the configured scheduling time is the earliest time at which the instance is expected to start, not a guaranteed start time. The completion time of upstream Nodes, the availability of compute resources, and other runtime conditions all affect the Node's actual execution time.
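The two execution conditions can be expressed as a simple readiness check. The following is a minimal sketch, not DataWorks code: the `Instance` class, its `status` values, and the `can_run` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class Instance:
    """Hypothetical model of a recurring instance (not a DataWorks API)."""
    name: str
    scheduled_time: datetime
    status: str = "pending"  # assumed values: "pending", "running", "succeeded", "failed"
    upstreams: List["Instance"] = field(default_factory=list)


def can_run(instance: Instance, now: datetime) -> bool:
    # Condition 1: every upstream instance has succeeded.
    # A dry-run instance also ends in "succeeded", so it satisfies this check.
    upstreams_done = all(u.status == "succeeded" for u in instance.upstreams)
    # Condition 2: the instance's own scheduled time has been reached.
    time_reached = now >= instance.scheduled_time
    return upstreams_done and time_reached
```

In this sketch, an instance whose scheduled time has passed still waits if any upstream instance has not succeeded, which is why the configured time and the actual start time can differ.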

Workflow scheduling scenarios

In a workflow consisting of Nodes A, B, and C (A→B→C), the scheduling time configuration affects the entire workflow execution:

Scenario 1: Unified start time

If the entire workflow must start after 3:00 AM, you only need to set the scheduled time for the root Node, A, to 03:00. Even if the downstream Nodes B and C keep an earlier scheduled time, such as 00:00, they start only after Node A completes successfully.

Scenario 2: Different start times

If Node A must run at 3:00 AM, Node B after 5:00 AM, and Node C after 6:00 AM, you must set their respective scheduled times to 03:00, 05:00, and 06:00.

Scenario 3: Specific start times

If Node A must run at 3:00 AM and Node B must run after 5:00 AM, while Node C has no specific time requirement, you must set the scheduled times for Node A to 03:00 and Node B to 05:00. Node C starts only after Node B, its upstream dependency, completes its run sometime after 5:00 AM.
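The three scenarios follow from one rule: a Node starts at the later of its own scheduled time and its upstream Node's completion time. The sketch below illustrates this for a linear chain. The function name `earliest_starts` and the 10-minute runtimes are assumptions made for the example, and resource waiting time is ignored.

```python
from datetime import datetime, timedelta


def earliest_starts(chain, day):
    """Return the earliest start time of each Node in a linear chain.

    chain: list of (name, "HH:MM" scheduled time, assumed runtime in minutes),
           ordered from upstream to downstream.
    day:   a datetime at midnight of the day being simulated.
    """
    starts = {}
    upstream_done = None
    for name, hhmm, runtime_min in chain:
        hour, minute = map(int, hhmm.split(":"))
        scheduled = day.replace(hour=hour, minute=minute)
        # Start at the later of: own scheduled time, upstream completion time.
        start = scheduled if upstream_done is None else max(scheduled, upstream_done)
        starts[name] = start
        upstream_done = start + timedelta(minutes=runtime_min)
    return starts


day = datetime(2026, 1, 1)
# Scenario 1: only the root Node A is set to 03:00.
scenario_1 = earliest_starts([("A", "03:00", 10), ("B", "00:00", 10), ("C", "00:00", 10)], day)
# Scenario 3: A at 03:00, B at 05:00, C has no specific requirement.
scenario_3 = earliest_starts([("A", "03:00", 10), ("B", "05:00", 10), ("C", "00:00", 10)], day)
```

With the assumed 10-minute runtimes, Node C starts at 03:20 in Scenario 1 and at 05:10 in Scenario 3, even though its own scheduled time is 00:00 in both cases.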

Impact of updating the schedule

After you modify a Node's schedule in Scheduling and republish the Node, the impact depends on the selected Instance Generation Method:

  • Generate instance on the next day (T+1): When you select this option and publish the Node, the scheduled times of instances already generated for the current day (T) and the previous day (T-1) are updated to the new time. This affects all such instances, regardless of their status (e.g., pending, running, or completed). Future instances are generated based on the new schedule.

  • Generate instance now: This option immediately generates new instances based on the new configuration. The scheduled times of historical instances do not change.

Scheduling types

DataWorks supports minute, hourly, daily, weekly, monthly, and yearly scheduling periods. On the Node editing page in DataStudio (Data Development), click Scheduling in the right-side pane and configure the settings in the Scheduling Time section.

Note

The method for configuring the scheduling time depends on how the Node is organized:

  • Workflow Node: If a Node is part of a workflow, its schedule is configured at the workflow level. Individual Nodes within the workflow cannot have separate schedules. To make changes, go to the workflow's scheduling configuration.

  • Standalone Node: If a Node is not part of any workflow, its schedule is configured independently.

Minute scheduling

Note

The minimum interval for minute scheduling is 1 minute.

Configuration example

Objective: Schedule the Node to run every 30 minutes between 00:00 and 23:59 each day.

Note

The cron expression is automatically generated based on your selections and cannot be edited manually.

Minute scheduling configuration example

Scheduling details

This configuration generates 48 recurring instances per day, with scheduled times at 00:00, 00:30, 01:00, ..., 23:30. The business date for each instance is the current day.
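The instance count can be checked by enumerating the scheduled times. This is an illustrative sketch only: `scheduled_times` is a hypothetical helper, not part of DataWorks, and it assumes that the end of the time range is inclusive.

```python
from datetime import datetime, timedelta


def scheduled_times(start, end, interval_minutes):
    """List "HH:MM" scheduled times from start to end (inclusive) at a fixed interval."""
    fmt = "%H:%M"
    current = datetime.strptime(start, fmt)
    last = datetime.strptime(end, fmt)
    times = []
    while current <= last:
        times.append(current.strftime(fmt))
        current += timedelta(minutes=interval_minutes)
    return times


times = scheduled_times("00:00", "23:59", 30)
print(len(times))            # 48
print(times[0], times[-1])   # 00:00 23:30
```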

Note

For more dependency scenarios related to minute scheduling, see Minute task dependencies.

Hourly scheduling

Notes

  • The time period is a closed interval [start, end]. For example, if you configure a task to run every 1 hour between 00:00 and 03:00, the scheduling system generates four instances with scheduled times at 00:00, 01:00, 02:00, and 03:00.

  • You can set a time range and interval, or specify multiple discrete run times.

Configuration example

Objective: Schedule the Node to run automatically every 6 hours between 00:00 and 23:59 each day.

Note

The cron expression is automatically generated based on your selections and cannot be edited manually.

Hourly scheduling configuration example

Scheduling details

The scheduling system generates four instances per day with scheduled times at 00:00, 06:00, 12:00, and 18:00.
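Because the time period is a closed interval, the end hour itself produces an instance when it falls on the interval. The sketch below shows this with whole hours. The helper `hourly_times` is hypothetical and only models the range-plus-interval option, not the discrete run times option.

```python
def hourly_times(start_hour, end_hour, interval_hours):
    """List "HH:00" scheduled times in the closed interval [start_hour, end_hour]."""
    # range() excludes its stop value, so add 1 to keep the end hour inclusive.
    return [f"{h:02d}:00" for h in range(start_hour, end_hour + 1, interval_hours)]


# Every 1 hour between 00:00 and 03:00: the end point 03:00 is included.
print(hourly_times(0, 3, 1))   # ['00:00', '01:00', '02:00', '03:00']

# Every 6 hours between 00:00 and 23:59: the last hour in range is 23.
print(hourly_times(0, 23, 6))  # ['00:00', '06:00', '12:00', '18:00']
```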

Note

For more dependency scenarios related to hourly scheduling, see Hourly task dependencies.

Daily scheduling

A daily Node runs once per day at a specified time. When you create a new recurring task, the default scheduled time is set to a random time between 00:00 and 00:30, which you can modify as needed.

Configuration example

Objective: Schedule the Node to run once at 13:00 every day.

Note

The cron expression is automatically generated based on your selections and cannot be edited manually.

Daily scheduling configuration example

Scheduling details

The scheduling system generates one instance for this task each day, with a scheduled time of 13:00.

Note

For more dependency scenarios related to daily scheduling, see Daily task dependencies.

Weekly scheduling

On non-scheduled days, a weekly Node performs a dry-run to ensure that downstream dependencies can run as scheduled. For more information, see Dry-run.

Configuration example

Objective: Schedule the task to run at a specific time every Monday and Friday. Instances for Mondays and Fridays will run fully, while instances for all other days will perform a dry-run.

Note

The cron expression is automatically generated based on your selections and cannot be edited manually.

Weekly scheduling configuration example

Scheduling details

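The scheduling system generates one instance for the task each day. Instances scheduled for Monday and Friday run the Node's code, and instances for all other days perform a dry-run.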

Note

When using the data backfill feature, carefully select the correct business date. In DataWorks, the rule is business date = scheduled date - 1 day. For example, to backfill a weekly task that runs on a Monday, you must select the preceding Sunday as the business date. If you select any other business date, the backfill instance will perform a dry-run.
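The backfill rule can be checked with a short date calculation. This is a sketch under stated assumptions: `backfill_mode` is a hypothetical helper that models only the rule business date = scheduled date - 1 day, and the dates are examples chosen for illustration.

```python
from datetime import date, timedelta

MONDAY, FRIDAY = 0, 4  # date.weekday() numbering: Monday is 0, Sunday is 6


def backfill_mode(business_date, run_weekdays):
    """Return "run" or "dry-run" for a backfill with the given business date."""
    # The scheduled date is one day after the business date.
    scheduled_date = business_date + timedelta(days=1)
    return "run" if scheduled_date.weekday() in run_weekdays else "dry-run"


# Sunday, March 22, 2026 -> scheduled on Monday, March 23, 2026
print(backfill_mode(date(2026, 3, 22), {MONDAY, FRIDAY}))  # run
# Monday, March 23, 2026 -> scheduled on Tuesday, March 24, 2026
print(backfill_mode(date(2026, 3, 23), {MONDAY, FRIDAY}))  # dry-run
```

Selecting the Monday itself as the business date is the common mistake: the corresponding scheduled date is Tuesday, so the backfill instance performs a dry-run.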

Monthly scheduling

On non-scheduled days, a monthly Node performs a dry-run to ensure that downstream dependencies can run as scheduled. For more information, see Dry-run.

For monthly scheduling, you can set the Specified Time to Last Day Of Each Month.

Configuration example

Objective: Schedule the task to run at a specific time on the last day of each month. Instances for the last day of the month will run fully, while instances for all other days will perform a dry-run.

Note

The cron expression is automatically generated based on your selections and cannot be edited manually.

Monthly scheduling configuration example

Scheduling details

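The scheduling system generates one instance for the task each day. The instance scheduled for the last day of each month runs the Node's code, and instances for all other days perform a dry-run.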

Note

When using the data backfill feature, carefully select the correct business date. In DataWorks, the rule is business date = scheduled date - 1 day. For example, to backfill a monthly task that runs on the last day of January (January 31), you must select January 30 as the business date. If you select any other business date, the backfill instance will perform a dry-run.
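Because month lengths vary, the business date for a month-end backfill is easiest to derive by calculation. The sketch below uses Python's standard `calendar.monthrange` to find the last day of a month. The helper name `month_end_business_date` is hypothetical and is not part of DataWorks.

```python
import calendar
from datetime import date, timedelta


def month_end_business_date(year, month):
    """Business date to select when backfilling the last-day-of-month instance."""
    # monthrange() returns (weekday of the 1st, number of days in the month).
    days_in_month = calendar.monthrange(year, month)[1]
    scheduled_date = date(year, month, days_in_month)
    # business date = scheduled date - 1 day
    return scheduled_date - timedelta(days=1)


print(month_end_business_date(2026, 1))  # 2026-01-30
print(month_end_business_date(2024, 2))  # 2024-02-28 (leap year: last day is Feb 29)
```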

Yearly scheduling

On non-scheduled days, a yearly Node performs a dry-run to ensure that downstream dependencies can run as scheduled. For more information, see Dry-run.

Configuration example

Objective: Schedule the task to run on the 1st and last day of January, April, July, and October each year. Instances for these specific dates will run fully, while instances for all other days will perform a dry-run.

Yearly scheduling configuration example

Scheduling details

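The scheduling system generates one instance for the task each day. Instances scheduled for the 1st and last day of January, April, July, and October run the Node's code, and instances for all other days perform a dry-run.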
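To see which dates run fully under this configuration, the execution dates for a given year can be enumerated. This is an illustrative sketch: `yearly_run_dates` is a hypothetical helper, and 2026 is an arbitrary example year.

```python
import calendar
from datetime import date


def yearly_run_dates(year, months=(1, 4, 7, 10)):
    """Return the 1st and last day of each selected month in the given year."""
    dates = []
    for month in months:
        last_day = calendar.monthrange(year, month)[1]  # number of days in the month
        dates.append(date(year, month, 1))
        dates.append(date(year, month, last_day))
    return dates


run_dates = yearly_run_dates(2026)
print(len(run_dates))               # 8
print(run_dates[0], run_dates[-1])  # 2026-01-01 2026-10-31
```

Under the behavior described for this configuration, the remaining days of the year (357 in 2026) would each get a dry-run instance.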