To have a node generate scheduling instances and run automatically according to its configuration as soon as it is deployed to the production environment, set the node's instance generation mode to Immediately After Deployment.
Background
After a node is deployed, you can view the latest task configuration in Operation Center. DataWorks generates auto triggered instances for the next day based on the node configuration of auto triggered tasks every night. However, when a node is newly created or updated and deployed to the production environment on the current day, the time when the auto triggered instances take effect or when the dependencies are updated depends on the instance generation mode you choose.
In DataWorks, you can control whether the instances take effect immediately by using two options for the instance generation mode: Next Day and Immediately After Deployment.
Next Day: Node creation and update operations take effect on the auto triggered instances of the next day. If you need to execute a task immediately after it is deployed to the production environment, you can perform a data backfill operation for the task.
Immediately After Deployment: Node creation and update operations take effect immediately, but there is a time difference between when the node is deployed and when runnable instances are actually generated. This has different impacts on different task execution scenarios. For more information, see Common scenarios for immediate instance generation.
Precautions
Nodes within a workflow do not support individual configuration of immediate instance generation after deployment. This option can only be configured for the entire workflow in the scheduling configuration on the workflow editing page.
Regardless of whether you choose Next Day or Immediately After Deployment for instance generation in the scheduling configuration, the period from 23:30 to 24:00 every day is reserved for generating all auto triggered instances for the next day. Tasks deployed during this time period will need to wait until the third day to generate corresponding instances.
Inconsistent instance generation modes between upstream and downstream tasks may lead to isolated nodes.
Time difference for immediate instance generation: To prevent instance changes from causing task execution exceptions, there is a 10-minute time difference for immediate instance generation after deployment. This means that the scheduled time must be at least 10 minutes later than the deployment time for the task to run according to the latest configuration.
Effective scope of immediate instance generation: Not all changes take effect immediately. For example, if you modify the data source instance associated with a node, then configure immediate instance generation and deploy it, this will not affect existing instances on the current day. The auto triggered instances on the current day will still execute using the data source instance from before the change.
Note: You can perform a data backfill operation on the auto triggered task with the latest configuration. The data backfill will be executed according to the latest task configuration.
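The reserved 23:30 to 24:00 window from the precautions above can be illustrated with a minimal Python sketch. It is not DataWorks code: the function names are hypothetical, the window is assumed to include 23:30 and exclude 24:00, and "the third day" is read as the deployment day plus two days.

```python
from datetime import date, datetime, time, timedelta

# Assumption: the reserved window runs from 23:30 (inclusive) to 24:00 (exclusive).
WINDOW_START = time(23, 30)


def in_reserved_window(deployed_at: datetime) -> bool:
    """Return True if the deployment falls in the nightly 23:30-24:00 window."""
    return deployed_at.time() >= WINDOW_START


def earliest_instance_date(deployed_at: datetime, immediate: bool) -> date:
    """Earliest date whose auto triggered instances reflect this deployment.

    immediate=True models Immediately After Deployment,
    immediate=False models Next Day. Both names are illustrative only.
    """
    day = deployed_at.date()
    if in_reserved_window(deployed_at):
        # Interpretation: "third day" = deployment day + 2, for either mode.
        return day + timedelta(days=2)
    if immediate:
        return day
    return day + timedelta(days=1)
```

For example, under these assumptions a deployment at 23:45 on May 1 maps to May 3 in either mode, while a deployment at 10:00 on May 1 maps to May 1 with Immediately After Deployment and to May 2 with Next Day.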
Immediate instance generation description
Immediate instance generation after deployment is only applicable for scheduled tasks in future time periods. Specifically, instances will only execute normally when the scheduling time of the task is later than the deployment time.
Auto triggered instances are generated on the day a new task is created, but only instances with scheduled times in future time periods will execute normally.
When updating a node's scheduling time, if the scheduled time is in the past, no instances will be generated. If the scheduled time is in a future time period, new instances will be generated according to the new configuration, replacing instances from before the update.
Note: The scheduled time must be at least 10 minutes after the node deployment time for immediate instance generation to work properly.
Instance scheduled time in normal execution period
Scenario 1: When creating a new node that generates actual running auto triggered instances on the same day, if the scheduled time of the instance is in the future relative to the time when the node is deployed and the instance is generated, and the difference between these two time points is greater than 10 minutes, the instance will be scheduled normally. For more information, see: Configure the immediate instance generation feature for a newly created node.
Scenario 2: After updating a node's configuration, if the scheduled time of the instance is in the future relative to the time when the node is deployed and the instance is generated, and the difference between these two time points is greater than 10 minutes, the instance will be scheduled normally. The scheduled instance will be the updated instance. For more information, see: Update the scheduling cycle of a deployed task.
Scenario 3: Impact on downstream dependencies when changing a task's scheduled time.
If the scheduled time of the changed instance is in the future relative to the time when the node is deployed and the instance is generated, and the difference between these two time points is greater than 10 minutes, the instance will be scheduled normally. Downstream instances that have not run will depend on the new instances after the change. For more information, see: Impact on downstream dependencies when changing a task's scheduling time.
If the scheduled time of the changed instance is in the past relative to the time when the node is deployed and the instance is generated, dry-run instances will be generated. Downstream instances that have not run will become isolated nodes. For more information, see: Inconsistent instance generation modes between upstream and downstream tasks.
We do not recommend using this feature when modifying the scheduling configuration of production nodes. This feature may cause dependency changes, dependency confusion, instance replacement, instance deletion, and other issues, making the dependencies on the current day complex. However, task dependencies will return to normal on the second day.
Instance scheduled time in dry-run period
If the scheduled time is in the past relative to the node deployment time, auto triggered instances will still be generated, but they will be dry-run instances. The instance status will be Expired Instance That Is Generated In Real Time, and no actual code logic will be executed. For more information, see: Configure the immediate instance generation feature for a newly created node.
Scenario 1: The scheduled time of the instance is in the future relative to the time when the node is deployed and the instance is generated, but the difference between these two time points is less than 10 minutes. The instance status will be Expired Instance That Is Generated In Real Time.
For example: Node A has a scheduled time of 09:05 and a deployment time of 09:00. The scheduled time is in the future relative to the deployment time, but the difference is less than 10 minutes, so Node A will generate a dry-run instance with the status Expired Instance That Is Generated In Real Time.
Scenario 2: The scheduled time of the instance is in the past relative to the time when the node is deployed and the instance is generated. An Expired Instance That Is Generated In Real Time is generated immediately.
For example: Node A has a scheduled time of 09:00 and a deployment time of 10:00, so the scheduled time is before the deployment time. Node A will immediately generate a dry-run instance with the status Expired Instance That Is Generated In Real Time.
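The rule behind the normal and dry-run periods can be summarized in a short Python sketch. It is an illustration only, not a DataWorks API: `classify_instance` and the constants are hypothetical names, and a difference of exactly 10 minutes is assumed to count as normal, following the "at least 10 minutes" wording. The scenarios above say "greater than 10 minutes", so the exact boundary behavior may differ.

```python
from datetime import datetime, timedelta

# The 10-minute lead time described in the text.
MIN_LEAD = timedelta(minutes=10)

NORMAL = "normal"
DRY_RUN = "Expired Instance That Is Generated In Real Time"


def classify_instance(scheduled_at: datetime, deployed_at: datetime) -> str:
    """Classify a same-day instance generated immediately after deployment.

    Assumption: a lead time of exactly 10 minutes is treated as normal.
    """
    if scheduled_at - deployed_at >= MIN_LEAD:
        return NORMAL
    # Covers both dry-run scenarios: scheduled time in the past, or in the
    # future but less than 10 minutes after deployment.
    return DRY_RUN
```

Applied to the examples above, a node scheduled at 09:05 and deployed at 09:00 is classified as a dry-run instance, as is a node scheduled at 09:00 and deployed at 10:00. A node scheduled at 09:30 and deployed at 09:00 is classified as normal.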
Common scenarios for immediate instance generation
When using the Immediately After Deployment mode to generate instances, the instance execution and upstream/downstream dependency situations for related functional scenarios are as follows: