DataWorks: Workflow

Last Updated: Jun 30, 2026

A workflow automates data processing: you drag and drop task nodes onto a visual directed acyclic graph (DAG), define their dependencies and schedule, and build a reliable data pipeline.

Key concepts

A workflow is a core orchestration unit in DataWorks. It organizes task nodes (SQL, Shell, Python, data synchronization, Check) into a directed acyclic graph (DAG) with clear dependencies, enabling unified scheduling and execution. Workflows can also be combined to support complex business scenarios.
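As an illustration of how DAG dependencies determine a run order, here is a minimal sketch using Python's standard `graphlib` module. It is not DataWorks code: the node names are hypothetical, and the scheduler's actual ordering logic is not described in this document.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical node names. Each key maps to the set of upstream nodes
# that must finish before it can run.
dependencies = {
    "sync_orders": set(),
    "sync_users": set(),
    "check_orders": {"sync_orders"},
    "daily_active_users_sql": {"check_orders", "sync_users"},
    "export_report_shell": {"daily_active_users_sql"},
}


def execution_order(deps):
    """Return one valid run order, or raise ValueError if deps contains a cycle."""
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError as exc:
        # The second element of args holds the nodes that form the cycle.
        raise ValueError(f"not a DAG, cycle found: {exc.args[1]}") from exc


order = execution_order(dependencies)
print(order)
```

Every upstream node appears before the nodes that depend on it, and a circular dependency is rejected, which is what the "acyclic" requirement means in practice.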

By integrating separate tasks into a structured process, workflows shift focus from individual task management to managing an entire data pipeline. Core benefits:

  • Abstract and visualize development processes
    Encapsulate dependent nodes, such as SQL and Shell tasks, into a business-oriented workflow, like one for "Daily Active User Analysis". This creates a clear DAG that clarifies the technical lineage. It also helps non-technical stakeholders understand the data flow and aligns business goals with technical implementation.

  • Atomic management for development and O&M
    A workflow is the smallest unit for changes and operations: you submit, deploy, test, rerun, and backfill it as a whole. This prevents production issues caused by partial modifications and keeps the pipeline consistent and stable end to end.

  • Define boundaries for team collaboration
    In a multi-team environment, a workflow clarifies ownership and responsibilities. For example, the trading team can own the trading data workflow, and the product team can own the product data workflow. This enables permission isolation and issue tracking. Standardized outputs also support efficient, decoupled collaboration between upstream and downstream teams.

Workflow type comparison

DataWorks recommends two workflow types:

  • Scheduled workflow: Runs automatically on a fixed schedule (for example, hourly, daily, or weekly). Scheduling rules trigger the workflow, and the scheduled time controls when each node runs. Suitable for recurring data processing.

  • Event-triggered workflow: Runs on demand rather than on a fixed schedule. It can be triggered manually, by an OpenAPI call, or by an event message. Suitable for real-time processing or responding to external events.

| Feature | Scheduled workflow | Event-triggered workflow | Manual workflow (not recommended) |
| --- | --- | --- | --- |
| Scheduling method | Triggered by scheduled time and dependencies | Manual/event/API triggering | Manual execution |
| Use case | Daily/hourly/weekly/monthly batch processing | Real-time processing/on-demand execution/external integration | Ad hoc tasks (legacy compatibility) |
| Parameter priority | Node > Workflow > Workspace | Node > Workflow > Workspace | Workflow > Node |
| Typical example | Daily T+1 reporting at midnight | Automatic processing when an OSS file arrives | One-time data fix |
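
The parameter priority entries above can be read as a lookup order: when the same parameter is defined at several levels, the leftmost level wins. The following is a minimal sketch of that rule using `collections.ChainMap`. The parameter names and values are hypothetical, and this does not reproduce how DataWorks stores or resolves parameters internally.

```python
from collections import ChainMap


def resolve_scheduled(node, workflow, workspace):
    """Scheduled and event-triggered workflows: Node > Workflow > Workspace."""
    # ChainMap searches its mappings left to right, so earlier mappings win.
    return dict(ChainMap(node, workflow, workspace))


def resolve_manual(workflow, node):
    """Manual workflows: Workflow > Node."""
    return dict(ChainMap(workflow, node))


# Hypothetical parameter values at each level.
workspace_params = {"region": "cn-hangzhou", "bizdate": "20260101", "retries": "1"}
workflow_params = {"bizdate": "20260629", "retries": "3"}
node_params = {"retries": "5"}

resolved = resolve_scheduled(node_params, workflow_params, workspace_params)
manual = resolve_manual(workflow_params, node_params)
print(resolved)
print(manual)
```

In this sketch, `retries` resolves to the node value for a scheduled workflow but to the workflow value for a manual workflow, which is the practical difference between the two priority orders.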

Important
  • An event-triggered workflow that has no trigger bound to it can be run manually, so it can gradually replace manual workflows.

  • Manual workflows are mainly used for compatibility with the legacy data development mode. We do not recommend using them for new projects.

Quick selection guide

Answer the following three questions to quickly determine the right workflow type:

[Figure: quick selection guide for workflow types]
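
One way to encode the choice is sketched below. It is based only on the comparison in the previous section, and the two yes/no criteria are assumptions rather than the questions from the figure.

```python
def choose_workflow_type(runs_on_fixed_schedule: bool,
                         legacy_compat_only: bool = False) -> str:
    """Pick a workflow type from two assumed yes/no criteria."""
    if legacy_compat_only:
        # Manual workflows exist mainly for the legacy development mode.
        return "manual workflow (not recommended)"
    if runs_on_fixed_schedule:
        return "scheduled workflow"
    # Covers manual, API, and event triggering, including manual runs
    # of an event-triggered workflow with no trigger bound.
    return "event-triggered workflow"


print(choose_workflow_type(runs_on_fixed_schedule=True))
print(choose_workflow_type(runs_on_fixed_schedule=False))
```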

FAQ

How do I obtain the Spec template of a workflow?

Open an existing workflow on the Data Studio page, and click Show Spec in the upper-right corner of the canvas to view and copy the workflow Spec (in JSON format). You can use this Spec as a template to create or update workflows through OpenAPI.
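
A minimal sketch of reusing a copied Spec as a template follows: parse the JSON, override selected fields, and serialize the result for a later OpenAPI request. The `kind`, `spec`, and `name` fields in the sample are hypothetical placeholders, not the real Spec schema, and the OpenAPI call itself is not shown.

```python
import json


def spec_from_template(template_json: str, overrides: dict) -> str:
    """Return a new Spec JSON string with dotted-path overrides applied.

    Only nested objects are supported; list indexes are not handled.
    """
    spec = json.loads(template_json)  # fresh object, the template string is untouched
    for dotted_path, value in overrides.items():
        *parents, leaf = dotted_path.split(".")
        target = spec
        for key in parents:
            target = target[key]  # KeyError means the path is missing from the template
        target[leaf] = value
    return json.dumps(spec, ensure_ascii=False)


# Hypothetical stand-in for the JSON copied from Show Spec.
template = '{"kind": "Workflow", "spec": {"name": "daily_active_user_analysis"}}'
new_spec = spec_from_template(template, {"spec.name": "weekly_active_user_analysis"})
print(new_spec)
```

Check the field paths against the Spec you actually copied before reusing this pattern, because the sample structure here is illustrative only.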

Can a scheduled task automatically stop after it succeeds once?

No. A scheduled workflow runs continuously based on the configured schedule and does not automatically stop after a successful execution. If you want a task to run only once, use one of the following methods:

  • Manually freeze the task: After the task succeeds, manually freeze it. A frozen task no longer participates in scheduling. For more information, see Freeze a task.

  • Use an event-triggered workflow: If your scenario requires only a single execution, an event-triggered workflow is a better fit. It is not bound to a schedule and runs only when it is triggered manually, through an API call, or by an event, so it runs once and then stops.

References

Select the relevant document based on your scenario: