
Dataphin: Managing integration and computing tasks

Last Updated: Jul 07, 2025

The Integration and Computing Tasks page includes computing tasks, sync tasks, and integration tasks. Each task corresponds to a scheduling node. This topic describes how to view and manage integration and computing tasks from a node perspective.

Access the Integration and Computing Tasks page

  1. In the top navigation bar of the Dataphin homepage, choose Development > O&M Center.

  2. In the left-side navigation pane, choose O&M Center > Recurring Task.

  3. In the top navigation bar, select the production or development environment.

  4. On the Recurring Task page, click the Integration and Computing Tasks tab.

Operations supported on the Integration and Computing Tasks list

After integration and computing tasks are submitted to the O&M Center for scheduling, they appear in the Recurring Task > Integration and Computing Tasks list. The list displays each task object, scheduling cycle, priority, owner, related baseline, project, HTTP path, scheduling resource group, last update time, and the supported operations.

  • Task Object: The recurring script task submitted to the O&M Center. The script name and script ID are displayed, along with the task's scheduling method. For more information, see Description of scheduling methods.

  • Recurrence: The time at which the task recurs, shown in the scheduling time zone.

  • Priority: The priority of the task. If the baseline feature is enabled, a baseline task takes the highest priority among all of its baselines, overriding the originally configured task priority.

  • Related Baseline: The baseline to which the task belongs as an end node, along with related baselines whose protection scope includes the task as an ancestor node.

    Note

    If the baseline feature is not enabled, this field is not displayed.

  • Project: The project to which the task belongs, displayed in the format Project English name (Project Chinese name).

  • HTTP Path: For DATABRICKS_SQL tasks, displays the HTTP path of the production or development environment, depending on the environment you selected.

    Note

    Only DATABRICKS_SQL tasks display this field. Other task types display -.

  • Resource Group: The name of the scheduling resource group used by the instance when the task runs.

    If the custom resource group specified for the task is not active, the project default resource group is used instead. If the project default resource group is also not active, the tenant default resource group is used. The order of precedence is Custom resource group > Project default resource group > Tenant default resource group (see the sketch after this list).

    Note

    After you change the project default resource group, the value displayed here may update with a delay, but execution uses the modified resource group.

    Tenant default resource group: Does not belong to any project; each tenant has exactly one. When a task specifies no custom resource group and the project specifies no project default resource group, the tenant default resource group is used for scheduling. This applies only to tasks that require exclusive resources (SQL tasks, virtual tasks, and similar types are excluded).
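
The fallback described above can be modeled with a minimal sketch. The class and function names below are hypothetical; Dataphin does not expose this resolution logic as a public API.

    from dataclasses import dataclass
    from typing import Optional

    # Hypothetical model of the documented fallback order:
    # custom > project default > tenant default.
    @dataclass
    class ResourceGroup:
        name: str
        active: bool

    def resolve_resource_group(
        custom: Optional[ResourceGroup],
        project_default: Optional[ResourceGroup],
        tenant_default: ResourceGroup,
    ) -> ResourceGroup:
        """Return the scheduling resource group an instance would use.
        An inactive group falls through to the next level."""
        if custom is not None and custom.active:
            return custom
        if project_default is not None and project_default.active:
            return project_default
        return tenant_default  # a tenant always has exactly one default group

    # The custom group is inactive, so the project default is used.
    chosen = resolve_resource_group(
        ResourceGroup("custom-rg", active=False),
        ResourceGroup("project-default-rg", active=True),
        ResourceGroup("tenant-default-rg", active=True),
    )
    print(chosen.name)  # -> project-default-rg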

The following operations are supported on the Integration and Computing Tasks list.

DAG Graph

Click the DAG icon to view the DAG graph of the integration and computing task. For more information, see Operations supported on Integration and Computing Tasks DAG nodes.

View Recurring Instances

View the recurring instances generated by the task. You can also manage and operate on these instances.

Edit Development Node

Go to the editing page of the task in the Dev project to edit the task.

Note

This is only applicable to the Dev-Prod development mode.

Edit Node

Go to the editing page of the task to edit the task.

Note

This is only applicable to the Basic mode.

View Production Node

Go to the Prod project to view the task configuration.

Note

This feature is not supported for tasks in Basic mode, or for tasks in Dev-Prod mode that have not been published to the production environment.

View Node Code

View the code written for the integration and computing task node.

View Data Backfill Instances

View and manage data backfill instances generated by data backfill operations.

Data Backfill

The data backfill feature refreshes data for recurring tasks over specified historical data timestamps. After a recurring task is developed, submitted, and published, it runs periodically according to its scheduling configuration. If you want to run a recurring task for a specific time period or refresh data for a historical time range, use the data backfill feature, as illustrated in the sketch below. For details on performing data backfill operations on integration and computing task nodes, see Appendix: Data backfill for recurring tasks.
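
To make "refresh a historical time range" concrete, the following sketch enumerates the daily data timestamps a backfill over a date range would cover. It is illustrative only; the actual backfill is performed through the console as described above.

    from datetime import date, timedelta

    def backfill_data_timestamps(start: date, end: date) -> list[date]:
        """Enumerate the daily data timestamps covered by a backfill
        over [start, end]: one instance per day for a daily task."""
        days = (end - start).days
        return [start + timedelta(days=i) for i in range(days + 1)]

    # Backfilling June 1-3, 2025 would generate three instances:
    for ts in backfill_data_timestamps(date(2025, 6, 1), date(2025, 6, 3)):
        print(ts.isoformat())  # 2025-06-01, 2025-06-02, 2025-06-03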

Modify Owner

Modify the owner of the task.

Note

This is only applicable to Basic mode and the Prod environment in Dev-Prod mode. The Dev environment does not support modification.

Modify Priority

Modify the priority of the task. Among all tasks that meet the scheduling conditions, those with higher priority run first.

Note
  • This is only applicable to Basic mode and the Prod environment in Dev-Prod mode. The Dev environment does not support modification.

  • If the baseline feature is enabled, task priority can only be configured as Lowest, Low, or Medium. Higher priorities need to be configured through baselines.

  • The priority of baseline tasks cannot be modified directly; it is determined by the baseline priority. Adjust it through the baseline instead.

  • If your compute engine type is MaxCompute, the correspondence between Dataphin task priorities and MaxCompute job priorities is Lowest (9), Low (7), Medium (5), High (3), Highest (1), as modeled in the sketch after this note. For more information about MaxCompute job priorities, see MaxCompute job priorities.

  • Priority settings for Spark SQL tasks take effect only when the HDFS of the Hadoop compute source is configured with separate task priority queues.
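
A minimal sketch of the priority mapping above. The function name is hypothetical; it only restates the documented correspondence and the baseline restriction.

    # Dataphin task priority -> MaxCompute job priority
    # (lower number = higher MaxCompute priority).
    DATAPHIN_TO_MAXCOMPUTE_PRIORITY = {
        "Lowest": 9,
        "Low": 7,
        "Medium": 5,
        "High": 3,
        "Highest": 1,
    }

    def maxcompute_priority(dataphin_priority: str, baseline_enabled: bool) -> int:
        """Return the MaxCompute job priority for a Dataphin task priority.
        With the baseline feature enabled, only Lowest/Low/Medium can be set
        directly on a task; High/Highest must come from baselines."""
        if baseline_enabled and dataphin_priority in ("High", "Highest"):
            raise ValueError(f"{dataphin_priority} must be configured through a baseline")
        return DATAPHIN_TO_MAXCOMPUTE_PRIORITY[dataphin_priority]

    print(maxcompute_priority("Medium", baseline_enabled=True))  # -> 5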

Pause

Set the current task node to the paused scheduling state. Pausing suits scenarios where a task and its downstream tasks temporarily do not need to run but will be used again later, for example, while you temporarily adjust calculation logic and want to avoid affecting downstream data.

Resume

Set a paused node to the normal scheduling state.

Configure Monitoring and Alerting

Configure monitoring rules for task execution. For details, see Overview of offline task monitoring.

Note

This is only applicable to Prod and Basic projects.

Modify HTTP Path

Modify the production environment HTTP path for the corresponding task. You can select from all HTTP paths configured for the cluster corresponding to the production project.

Note

This operation is only supported in the production environment.

Modify Resource Group

Modify the scheduling resource group used by instances generated from the task when they run.

Note
  • If the selected tasks span multiple projects, the target resource group list shows only the scheduling resource groups authorized for all of those projects. We recommend filtering to a single project first and then performing batch settings.

  • The modification affects only newly generated instances, not instances that have already been generated. To change the resource group used by existing instances, do so in the instance list.

Operations supported on Integration and Computing Tasks DAG nodes

The DAG graph shows the upstream and downstream dependencies of task nodes, and you can operate on and manage upstream and downstream nodes. By default, the DAG graph displays the Main node (the selected node) and the first layer of upstream and downstream nodes. When you select an integration and computing task node, you can perform related operations on that task.

Dataphin supports operations and management of cross-project nodes. To perform operations on cross-project script nodes, you need to have view and operation permissions for the project to which the task belongs.

  • Operations supported on the DAG graph

    Expand Parent Nodes, Expand Child Nodes: Expand dependency nodes at different levels of the Main node in the DAG graph (modeled in the sketch after this list).

  • Operations supported on DAG nodes

    The operations supported on Integration and Computing Tasks DAG nodes are the same as those supported on the Integration and Computing Tasks list. For more information, see Operations supported on the Integration and Computing Tasks list.
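
A minimal sketch of the layer-by-layer expansion, assuming a hypothetical in-memory adjacency representation of the dependency DAG (the console builds this view for you):

    from collections import defaultdict

    # Hypothetical dependency edges: (upstream, downstream).
    edges = [
        ("ods_orders", "dwd_orders"),
        ("dwd_orders", "dws_orders_daily"),
        ("dim_users", "dws_orders_daily"),
    ]

    parents, children = defaultdict(set), defaultdict(set)
    for up, down in edges:
        parents[down].add(up)
        children[up].add(down)

    def expand(node: str, direction: str, layers: int = 1) -> set[str]:
        """Expand `layers` levels of parent or child nodes of the Main node,
        mirroring Expand Parent Nodes / Expand Child Nodes in the DAG graph."""
        adjacency = parents if direction == "parents" else children
        frontier, seen = {node}, set()
        for _ in range(layers):
            frontier = {n for f in frontier for n in adjacency[f]} - seen
            seen |= frontier
        return seen

    # By default the DAG shows the Main node plus one layer in each direction:
    print(expand("dws_orders_daily", "parents"))  # {'dwd_orders', 'dim_users'}
    print(expand("dwd_orders", "children"))       # {'dws_orders_daily'}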

Batch operations supported on Integration and Computing Tasks

The following batch operations are supported for recurring integration and computing tasks:

Pause

Pause all selected tasks. Paused tasks still generate instances normally, but those instances and their downstream dependent instances are not scheduled (see the sketch after this entry).
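
A minimal sketch of these pause semantics, using a hypothetical downstream map: instances of the paused tasks are skipped, and so is everything downstream of them.

    # Hypothetical dependency map: task -> its direct downstream tasks.
    downstream = {
        "ods_orders": ["dwd_orders"],
        "dwd_orders": ["dws_orders_daily"],
        "dws_orders_daily": [],
    }

    def unscheduled_tasks(paused: set[str]) -> set[str]:
        """Return every task whose instances will not be scheduled:
        the paused tasks themselves plus everything downstream of them."""
        blocked, stack = set(), list(paused)
        while stack:
            task = stack.pop()
            if task in blocked:
                continue
            blocked.add(task)
            stack.extend(downstream.get(task, []))
        return blocked

    print(unscheduled_tasks({"dwd_orders"}))  # {'dwd_orders', 'dws_orders_daily'}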

Resume

Resume scheduling for selected tasks.

Modify Owner

Batch modify the owners of recurring integration and computing tasks.

Note

This is only applicable to Basic mode and the Prod environment in Dev-Prod mode. The Dev environment does not support modification.

Modify Priority

Batch modify the priorities of recurring integration and computing tasks.

Note
  • This is only applicable to Basic mode and the Prod environment in Dev-Prod mode. The Dev environment does not support modification.

  • If the baseline feature is enabled, task priority can only be configured as Lowest, Low, or Medium. Higher priorities should be configured through baselines.

  • The priority of baseline tasks cannot be modified directly; it is determined by the baseline priority. Adjust it through the baseline instead.

  • If your compute engine type is MaxCompute, the correspondence between Dataphin task priorities and MaxCompute job priorities is Lowest (9), Low (7), Medium (5), High (3), Highest (1). For more information about MaxCompute job priorities, see MaxCompute job priorities.

  • Priority settings for Spark SQL tasks take effect only when the HDFS of the Hadoop compute source is configured with separate task priority queues.

Modify HTTP Path

Modify the production environment HTTP path for multiple DATABRICKS_SQL tasks. If the selected DATABRICKS_SQL tasks belong to different Databricks clusters, you can specify an HTTP path for each cluster separately (see the sketch after this entry). You can select from all HTTP paths configured for the corresponding cluster.

Note

This operation is only supported when DATABRICKS_SQL tasks are selected in the production environment.
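
A minimal sketch of the per-cluster grouping, with hypothetical task records and HTTP paths (in the console, cluster membership and the available paths come from the production project configuration):

    from collections import defaultdict

    tasks = [
        {"id": "t1", "type": "DATABRICKS_SQL", "cluster": "cluster-a"},
        {"id": "t2", "type": "DATABRICKS_SQL", "cluster": "cluster-b"},
        {"id": "t3", "type": "DATABRICKS_SQL", "cluster": "cluster-a"},
    ]
    http_path_per_cluster = {
        "cluster-a": "/sql/1.0/warehouses/abc",
        "cluster-b": "/sql/1.0/warehouses/def",
    }

    # Group the selected tasks by cluster, then assign one HTTP path per
    # group, mirroring how the batch dialog sets paths per cluster.
    by_cluster = defaultdict(list)
    for task in tasks:
        by_cluster[task["cluster"]].append(task["id"])

    for cluster, task_ids in by_cluster.items():
        print(f"{cluster}: set HTTP path {http_path_per_cluster[cluster]} for {task_ids}")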

Modify Resource Group

Batch modify the scheduling resource groups used by instances generated from multiple tasks when they run.

Note
  • If the selected tasks span multiple projects, the target resource group list shows only the scheduling resource groups authorized for all of those projects. We recommend filtering to a single project first and then performing batch settings.

  • The modification affects only newly generated instances, not instances that have already been generated. To change the resource group used by existing instances, do so in the instance list.