
Scheduling configuration

Last Updated: Apr 10, 2018

Currently, the time attribute of tasks supports five configuration modes: month, week, day, hour, and minute. This article describes these five configuration modes and instance operation in the scheduling system.

Note:

For a periodic task, dependencies take priority over the time attribute. When the time point determined by the time attribute arrives, the task instance does not start running immediately. Instead, it first checks whether all the upstream instances it depends on have run successfully.

  • If not all upstream dependent instances have run successfully, the instance remains in the Idle state even though the scheduled operation time has arrived.

  • If all upstream dependent instances have run successfully but the scheduled operation time has not yet arrived, the instance enters the Waiting state and stays there until that time arrives.

  • If all upstream dependent instances have run successfully and the scheduled operation time has arrived, the instance enters the Waiting state and is ready to run.
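The three cases above can be summarized as a small decision function. This is only an illustrative sketch: the names `InstanceState` and `instance_state` are hypothetical and are not part of DataWorks. The case where the upstream instances have not finished and the scheduled time has not arrived is not described above; the sketch assumes it is also Idle.

```python
from enum import Enum


class InstanceState(Enum):
    """Hypothetical labels for the two states named in the note above."""
    IDLE = "Idle"
    WAITING = "Waiting"


def instance_state(upstream_all_succeeded: bool, scheduled_time_arrived: bool):
    """Return (state, ready_to_run) for a periodic task instance.

    Dependencies are checked first, because they take priority over
    the time attribute.
    """
    if not upstream_all_succeeded:
        # Assumption: Idle regardless of whether the time has arrived.
        return InstanceState.IDLE, False
    # All upstream instances succeeded: the instance is Waiting, and it is
    # ready to run only once the scheduled time has arrived.
    return InstanceState.WAITING, scheduled_time_arrived
```

For example, `instance_state(True, False)` yields the Waiting state with `ready_to_run` set to `False`.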

In DataWorks, after a task is successfully submitted, the underlying scheduling system generates instances on a daily basis, starting from the next day, in accordance with the time attribute of the task. The system then runs these new instances based on the run results of the upstream dependent instances and the scheduled time points. For tasks successfully submitted after 23:30, the system starts generating instances from the third day onwards.
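The following sketch shows how the 23:30 cutoff affects the first day for which instances exist. It is an illustration, not DataWorks code: `first_instance_date` is a hypothetical helper, "third day" is assumed to mean the submission day plus two days, and a submission at exactly 23:30:00 is assumed not to count as "after 23:30".

```python
from datetime import date, datetime, time, timedelta

# Cutoff taken from the text above.
CUTOFF = time(23, 30)


def first_instance_date(submitted_at: datetime) -> date:
    """Return the first day for which the scheduler generates instances."""
    # Assumption: submission day is day 1, so "third day" is +2 days.
    days_ahead = 2 if submitted_at.time() > CUTOFF else 1
    return submitted_at.date() + timedelta(days=days_ahead)


# Submitted in the afternoon: instances start the next day.
print(first_instance_date(datetime(2018, 4, 10, 15, 0)))   # 2018-04-11
# Submitted after 23:30: instances start one day later.
print(first_instance_date(datetime(2018, 4, 10, 23, 45)))  # 2018-04-12
```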

Note:

If a task is scheduled to run once every Monday, it actually executes only on Mondays. On all other days, its instance is set directly to Successful without being run. Therefore, when you test or supplement data for a weekly scheduling task, you must select a business date that is one day before the scheduled run date (business date = operation time - 1).
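As a worked example of the rule business date = operation time - 1, the sketch below lists the business dates to select so that a Monday task actually executes. The helper names are hypothetical, and the date range is chosen only for illustration.

```python
from datetime import date, timedelta

MONDAY = 0  # date.weekday() numbers Monday as 0


def business_date_for(operation_date: date) -> date:
    """Business date is the day before the scheduled run date."""
    return operation_date - timedelta(days=1)


def weekly_business_dates(first_day: date, last_day: date, run_weekday: int):
    """List the business dates whose instances actually run in a range."""
    dates = []
    day = first_day
    while day <= last_day:
        if day.weekday() == run_weekday:
            dates.append(business_date_for(day))
        day += timedelta(days=1)
    return dates


# Mondays in April 2018 fall on the 2nd, 9th, 16th, 23rd, and 30th,
# so the matching business dates are the Sundays before them.
for d in weekly_business_dates(date(2018, 4, 1), date(2018, 4, 30), MONDAY):
    print(d)
```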

Daily scheduling tasks

Daily scheduling tasks automatically run once every day. When you create a periodic task, by default the task is set to run at 00:00 every day. You can specify the operation time point based on your needs.

SchedulingByDay

Scenarios

Import, statistics processing, and export tasks are all daily tasks. Statistics processing tasks depend on import tasks, and export tasks depend on statistics processing tasks. The scheduling system automatically generates instances for these tasks and runs them.

Dependency

Weekly scheduling tasks

Weekly scheduling tasks automatically run once at a specific point of time on specific days of each week. On non-designated dates, to guarantee the normal running of downstream instances, the system also generates instances but only sets them to Run Successfully without actually running any logic or occupying resources.

SchedulingByWeek

Instances generated on every Monday and Friday run normally, whereas on Tuesday, Wednesday, Thursday, Saturday, and Sunday, instances are generated and directly set to Run Successfully.

In accordance with the configuration, the scheduling system automatically generates instances for the task and runs them.
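The Monday and Friday example can be sketched as follows. This only mimics the described behavior: `weekly_instance_action` and its return strings are hypothetical, not scheduler output.

```python
from datetime import date, timedelta

MONDAY, FRIDAY = 0, 4  # date.weekday() numbers Monday as 0


def weekly_instance_action(day: date, run_weekdays: set) -> str:
    """Describe what happens to the instance generated for a given day."""
    if day.weekday() in run_weekdays:
        return "run normally"
    # An instance still exists on other days so that downstream instances
    # are not blocked, but no logic runs and no resources are used.
    return "set to Run Successfully"


# One week starting Monday, April 2, 2018.
start = date(2018, 4, 2)
for offset in range(7):
    day = start + timedelta(days=offset)
    print(day, weekly_instance_action(day, {MONDAY, FRIDAY}))
```

An instance is produced for all seven days; only the action differs.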

Monthly scheduling tasks

Monthly scheduling tasks automatically run once at a specific point of time on specific days of each month. On non-designated dates, to guarantee the normal running of downstream instances, the system also generates instances but only sets them to Run Successfully without actually running any logic or occupying resources.

SchedulingByMonth

Instances generated on the 1st day of every month run normally, whereas on other days, instances are generated and directly set to Run Successfully.

In accordance with the configuration, the scheduling system automatically generates instances for the task and runs them.

Hourly scheduling tasks

An hourly scheduling task runs once every N hours (N × 1 hour) within a daily time range. For example, a task can run once every hour from 01:00 to 04:00 every day.

SchedulingByHour

Note:

The cycle is calculated over a fully closed range, so both the start time and the end time are included. For example, if you schedule a task to run at an interval of 1 hour from 00:00 to 03:00, the time range is [00:00, 03:00], the interval is 1 hour, and the scheduling system generates four instances every day, scheduled to run at 00:00, 01:00, 02:00, and 03:00, respectively.

In accordance with the configuration, a scheduling task runs automatically once every 6 hours from 00:00 to 23:59 every day. The scheduling system automatically generates instances for the task and runs them.
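To make the closed-range rule concrete, the sketch below enumerates the daily run times for an hourly task. `hourly_run_times` is a hypothetical helper written for this illustration. Because both end points are included, a 1-hour interval from 00:00 to 03:00 yields four instances.

```python
from datetime import datetime, timedelta


def hourly_run_times(start: str, end: str, interval_hours: int):
    """List scheduled times (HH:MM) in the closed range [start, end]."""
    if interval_hours < 1:
        raise ValueError("interval_hours must be at least 1")
    fmt = "%H:%M"
    current = datetime.strptime(start, fmt)
    last = datetime.strptime(end, fmt)
    step = timedelta(hours=interval_hours)
    times = []
    # "<=" keeps the end point, which is what makes the range closed.
    while current <= last:
        times.append(current.strftime(fmt))
        current += step
    return times


print(hourly_run_times("00:00", "03:00", 1))
# ['00:00', '01:00', '02:00', '03:00']
print(hourly_run_times("00:00", "23:59", 6))
# ['00:00', '06:00', '12:00', '18:00']
```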

Minutely scheduling tasks

A minutely scheduling task runs once at intervals of N × the specified minutes. Currently, the shortest supported cycle is once every 5 minutes.

SchedulingByMinutes

In accordance with the configuration, a scheduling task runs automatically once every 30 minutes from 00:00 to 23:59 every day. The scheduling system automatically generates instances for the task and runs them.
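The 30-minute example can be checked with a short sketch. `minutely_run_times` is a hypothetical helper; it rejects intervals below the 5-minute minimum mentioned above and assumes the same closed-range rule as for hourly tasks.

```python
MIN_INTERVAL_MINUTES = 5  # minimum cycle stated in this section


def minutely_run_times(start: str, end: str, interval_minutes: int):
    """List scheduled times (HH:MM) in the closed range [start, end]."""
    if interval_minutes < MIN_INTERVAL_MINUTES:
        raise ValueError("the minimum supported interval is 5 minutes")

    def to_minutes(hhmm: str) -> int:
        hours, minutes = hhmm.split(":")
        return int(hours) * 60 + int(minutes)

    first, last = to_minutes(start), to_minutes(end)
    # range() excludes its stop value, so use last + 1 to keep the end point.
    return [f"{m // 60:02d}:{m % 60:02d}"
            for m in range(first, last + 1, interval_minutes)]


times = minutely_run_times("00:00", "23:59", 30)
print(len(times), times[0], times[-1])  # 48 00:00 23:30
```

Under these assumptions, a 30-minute cycle over the whole day produces 48 instances, the last one scheduled at 23:30.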

FAQs

What about those non-designated dates for weekly/monthly tasks?

For weekly/monthly scheduling tasks, on non-designated dates, instances are still generated but directly set to Run Successfully. In that case, no information about them is generated in the log, which is normal.

Are instances generated if a task is set to “Suspended”?

When a task is set to “Suspended”, the scheduling system still generates one or more instances for the task every day based on the time attribute. However, when the scheduled time of these instances arrives, their code does not actually run. Instead, the instances are directly set to “Failed” so that downstream instances are not triggered.

Because the code of instances that are directly set to “Failed” is never executed, no log information is generated for them.

If a task is deleted, are the instances affected?

The scheduling system generates one or more instances for a task every day based on the time attribute, and these instances are not deleted when the task is deleted. Therefore, if a task is deleted after running for a period, the instances that are triggered afterwards fail to run because the system cannot find the required code. In that case, the instance reports an error.

What if I want to calculate the monthly data on the last day of every month?

Currently, the system does not support scheduling on the last day of every month. If you set the cycle to the 31st day of every month, the task actually runs only in months that have 31 days, and only on the 31st. On all other dates, instances are generated and directly set to “successfully run”.

If you want to calculate monthly data, we recommend that you choose the 1st day of every month and calculate the data for the previous month.
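The sketch below illustrates both points: which months a task configured for the 31st would actually run in, and how a task that runs on the 1st can derive the previous month's date range. The helper names are hypothetical, and only the Python standard library is used.

```python
import calendar
from datetime import date, timedelta


def months_with_day(year: int, day: int):
    """Return the months of a year that contain the given day of month."""
    return [m for m in range(1, 13)
            if calendar.monthrange(year, m)[1] >= day]


def previous_month_range(run_date: date):
    """Return (first_day, last_day) of the month before run_date."""
    last_of_previous = run_date.replace(day=1) - timedelta(days=1)
    return last_of_previous.replace(day=1), last_of_previous


# A task set to the 31st would run in only seven months of 2018.
print(months_with_day(2018, 31))  # [1, 3, 5, 7, 8, 10, 12]

# A task that runs on March 1 processes all of February.
first, last = previous_month_range(date(2018, 3, 1))
print(first, last)  # 2018-02-01 2018-02-28
```

Running on the 1st avoids special handling for month lengths and leap years, because the end of the previous month is computed rather than configured.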
