
Realtime Compute for Apache Flink:Manage workflows

Last Updated:Nov 26, 2025

A workflow is a directed acyclic graph (DAG) that you can create by dragging tasks and associating the tasks with each other. If you want to run tasks at specific points in time, you can create a workflow and configure tasks and scheduling policies in the workflow. This topic describes how to create and run a workflow.
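Because a workflow is a DAG, its tasks run in dependency order. The following Python sketch is an illustration only, not product code, and the task names are hypothetical; it shows how upstream-task dependencies determine a valid execution order using a topological sort (Kahn's algorithm):

```python
from collections import deque

def execution_order(tasks, upstream):
    """Return a valid run order for workflow tasks.

    tasks: list of task names.
    upstream: dict mapping a task to the tasks it depends on.
    Raises ValueError if the dependencies contain a cycle,
    because the graph would then not be a DAG.
    """
    indegree = {t: len(upstream.get(t, [])) for t in tasks}
    downstream = {t: [] for t in tasks}
    for t, ups in upstream.items():
        for u in ups:
            downstream[u].append(t)

    ready = deque(t for t in tasks if indegree[t] == 0)
    order = []
    while ready:
        t = ready.popleft()
        order.append(t)
        for d in downstream[t]:
            indegree[d] -= 1
            if indegree[d] == 0:
                ready.append(d)
    if len(order) != len(tasks):
        raise ValueError("dependency cycle: not a DAG")
    return order

# The initial task has no upstream tasks; the others depend on it.
print(execution_order(
    ["extract", "transform", "load"],
    {"transform": ["extract"], "load": ["transform"]},
))
# → ['extract', 'transform', 'load']
```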

Limitations

  • You can create a workflow to schedule only tasks that are associated with batch deployments.

  • The Workflows feature is in public preview. The service level agreement (SLA) is not guaranteed in the public preview phase. For more information, see Realtime Compute for Apache Flink SLA. If you have questions when you use this feature, submit a ticket for technical support.

  • The Workflows feature is supported only in the China (Shanghai), China (Hangzhou), China (Beijing), China (Shenzhen), China (Zhangjiakou), and Singapore regions.

Create a workflow

  1. Log on to the management console of Realtime Compute for Apache Flink.

  2. Click Console in the Actions column of your workspace.

    The development console appears.

  3. In the left-side navigation pane, choose O&M > Workflows.

  4. Click Create Workflow.

  5. In the Create Workflow panel, configure the parameters:

    Parameter

    Description

    Name

    The workflow name, which must be unique in the current namespace.

    Variable Configuration

    One or more preset variables that are used for data processing.

    • Variable Name: Enter a custom variable name, such as ${date}.

    • Variable Value: Enter a static date, a time format, or an expression.

    You can configure the following system time variables:

    • Variable whose name is system.biz.date and value is ${system.biz.date}: This variable specifies the date of the day before the scheduled execution time of a daily scheduling workflow instance, in the yyyyMMdd format.

    • Variable whose name is system.biz.curdate and value is ${system.biz.curdate}: This variable specifies the date of the scheduled execution time of a daily scheduling workflow instance, in the yyyyMMdd format.

    • Variable whose name is system.datetime and value is ${system.datetime}: This variable specifies the scheduled execution time of a daily scheduling workflow instance, in the yyyyMMddHHmmss format.

    Note
    • Skip this configuration if you are creating a workflow for materialized tables.

    • The configured variables apply to all task-associated deployments. The variables configured for a workflow take precedence over those configured for a deployment.

    Scheduling Type

    The scheduling type. Valid values:

    • Manual Scheduling: Click Execute in the Actions column to run the workflow. Manual scheduling is ideal for ad-hoc testing or processing.

    • Periodic Scheduling: The workflow runs based on the scheduling rule. The workflow can be scheduled by minute, hour, or day.

    Important

    To create a task of the materialized table type, choose Periodic Scheduling.

    Scheduling Rule

    This parameter is required only when Scheduling Type is set to Periodic Scheduling. Select Use cron expression to specify complex scheduling rules. Examples:

    • 0 0 */4 ? * *: The workflow is scheduled every four hours.

    • 0 0 2 ? * *: The workflow is scheduled at 02:00:00 every day.

    • 0 0 5,17 ? * MON-FRI: The workflow is scheduled at 05:00:00 and 17:00:00 from Monday to Friday.

    Scheduled Start Time

    The time when the scheduling rule takes effect. This parameter is required only when the Scheduling Type parameter is set to Periodic Scheduling.

    Important
    • After you create a periodic scheduling workflow, you must turn on the switch in the State column to run the workflow at the time that is specified by the Scheduled Start Time parameter.

    • To prevent dry runs or failures, make sure that the start time is a point in the future.

    Failure Retry Times

    The number of retries for each task in the workflow. By default, a task is not retried if it fails.

    Failure Notification Email

    The email address to which notifications are sent if a task fails.

    Note

    Notifications can be sent via DingTalk or text messages. For more information, see Configure monitoring and alerting.

    Resource Queue

    The queue on which the workflow is deployed. After you configure the queue for a workflow, the tasks in the workflow are automatically deployed on the queue. Therefore, you do not need to specify a queue for the tasks in the workflow.

    Note

    The configuration of this parameter does not change the queue for existing batch deployments.

    Tags

    One or more tags of the workflow. Specify a key and a value for each tag.
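The system time variables described above can be understood with plain date arithmetic. The following Python sketch is an illustration only, not product code; it computes the values that system.biz.date, system.biz.curdate, and system.datetime resolve to for a given scheduled execution time:

```python
from datetime import datetime, timedelta

def system_variables(scheduled: datetime) -> dict:
    """Values the documented system time variables resolve to
    for a daily scheduling workflow instance."""
    return {
        # Date of the day before the scheduled execution time, yyyyMMdd.
        "system.biz.date": (scheduled - timedelta(days=1)).strftime("%Y%m%d"),
        # Date of the scheduled execution time, yyyyMMdd.
        "system.biz.curdate": scheduled.strftime("%Y%m%d"),
        # Scheduled execution time, yyyyMMddHHmmss.
        "system.datetime": scheduled.strftime("%Y%m%d%H%M%S"),
    }

print(system_variables(datetime(2025, 11, 26, 2, 0, 0)))
# → {'system.biz.date': '20251125', 'system.biz.curdate': '20251126',
#    'system.datetime': '20251126020000'}
```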

  6. Click Create.

    The workflow builder appears.

  7. Configure the initial task of the workflow.

    An initial task is automatically created for the workflow. Click the initial task named untitled. In the Edit Task panel, configure the following parameters and click Save:

    Deployment

    Parameter

    Description

    Deployment

    Only batch deployments in the current namespace are displayed. Fuzzy search is supported.

    Name

    The name of the task in the current workflow.

    Upstream Tasks

    The upstream task on which the current task depends. You can select only other tasks in the current workflow.

    Note

    Skip this configuration for the initial task, which is not dependent on any other task.

    Failure Retry Times

    The number of retries for the task. By default, it inherits the workflow's failure retry count. If a retry count is explicitly specified for the task, it takes precedence over the workflow's setting.

    Subscription

    The state of the task to which you want to subscribe. To enable state notifications, provide an email address and configure the Strategy setting to specify if you want to be notified of task startup, failure, or both.

    Timeout(sec)

    The timeout period of the task. If the execution time of the task exceeds this value, the task fails.

    Resource Queue

    The queue where the task will be deployed. By default, the task inherits the workflow's queue unless explicitly specified otherwise.

    Note

    This configuration does not affect the queues for existing batch deployments.

    Tags

    You can specify a tag key and a tag value for the task in the workflow.
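As the Failure Retry Times and Resource Queue descriptions above state, a task-level setting takes precedence over the workflow-level default. A minimal sketch of that precedence rule, using a hypothetical helper that is not part of the product:

```python
def effective_setting(task_value, workflow_value):
    """A task-level setting wins when explicitly set;
    otherwise the workflow-level value is inherited."""
    return task_value if task_value is not None else workflow_value

# Workflow default: 3 retries. One task overrides it; the other inherits.
print(effective_setting(0, 3))     # → 0
print(effective_setting(None, 3))  # → 3
```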

    Materialized table

    Parameter

    Description

    Materialized Table

    A materialized table is selectable only if it meets the following conditions:

    • It is time-partitioned;

    • Its refresh mode is continuous;

    • It is powered by VVR 11.0 or later.

    Name

    The name of the task in the current workflow.

    Time Partition

    • Partitioned Column: The time partition column name for the materialized table.

    • Partition Format: such as yyyyMMdd.

    A self-adaptive description is provided based on the workflow's scheduled time, the partition column, and the partition format.

    Resource Setting

    Specify the resources available for scheduled data backfilling. For Parallelism, select Automatically allocate to let Flink set an optimal parallelism.

    Upstream Tasks

    The upstream task on which the current task depends. You can select only other tasks in the current workflow.

    Note

    Skip this configuration for the initial task, which is not dependent on any other task.

    Failure Retry Times

    The number of retries for the task. By default, it inherits the workflow's failure retry count. If a retry count is explicitly specified for the task, it takes precedence over the workflow's setting.

    Subscription

    The state of the task to which you want to subscribe. To enable state notifications, provide an email address and configure the Strategy setting to specify if you want to be notified of task startup, failure, or both.

    Timeout(sec)

    The timeout period of the task. If the execution time of the task exceeds this value, the task fails.

    Resource Queue

    The queue on which the task is deployed. If you do not configure this parameter, the queue on which the workflow is deployed is used.

    Note

    This configuration does not affect the queues for existing materialized tables.

    Tags

    You can specify a tag key and a tag value for the task in the workflow.

    Note
    • After a materialized table task is created, Realtime Compute for Apache Flink prompts you to create a downstream task based on the lineage of the materialized table. In the dialog box that appears, select the downstream task to quickly create it.

    • The downstream task must also meet these conditions:

      • The materialized table is powered by VVR 11.0 or later;

      • It is time-partitioned;

      • The refresh mode is continuous or the freshness is less than 30 minutes.
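The eligibility conditions listed above can be summarized as a single predicate. The following sketch is a hypothetical illustration of those rules, not a product API, and the parameter names are assumptions:

```python
def downstream_eligible(vvr_version: tuple, time_partitioned: bool,
                        refresh_mode: str, freshness_minutes: int) -> bool:
    """Conditions from this topic for creating a downstream
    materialized-table task: VVR 11.0 or later, time-partitioned,
    and either a continuous refresh mode or freshness under 30 minutes."""
    return (
        vvr_version >= (11, 0)
        and time_partitioned
        and (refresh_mode == "continuous" or freshness_minutes < 30)
    )

print(downstream_eligible((11, 1), True, "full", 10))        # → True
print(downstream_eligible((11, 0), True, "continuous", 60))  # → True
print(downstream_eligible((8, 0), True, "continuous", 10))   # → False
```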

  8. (Optional) Click Add Task in the lower part of the workflow builder page to add more tasks.

  9. Save the configurations of the workflow.

    1. Click Save in the upper-right corner of the workflow builder page.

    2. In the Save Workflow dialog box, click OK.

Run a workflow

A workflow instance is generated in the Instance History section on the Overview tab of the workflow details page each time a workflow runs.

  • Manually run a workflow for ad-hoc testing or processing

    Find the desired workflow and click Execute in the Actions column. In the Execute Manually dialog box, select Manually Triggered for Schedule Type and click OK. Each time you click Execute, the workflow runs once.

  • Run a periodic scheduling workflow

    Find the desired workflow and turn on the switch in the State column to run the desired workflow at the time specified by the Scheduled Start Time parameter.

Note

To fill in missing historical data or reprocess data in specific partitions within certain time periods, use the data backfilling feature.

Backfill data

Data backfilling supplements or updates data within a specified period of time. You can backfill data in scenarios such as retransmission of historical data from upstream systems, dimension table corrections, and the addition of new interfaces.

Backfill data

  1. Log on to the management console of Realtime Compute for Apache Flink.

  2. Find the workspace that you want to manage and click Console in the Actions column.

  3. In the left-side navigation pane, choose O&M > Workflows.

  4. Find the desired workflow and click Execute in the Actions column.

  5. In the Execute Manually dialog box, select Backfill for Schedule Type and configure the scheduling information. The following table describes the parameters.


    Parameter

    Description

    Interval

    The interval is passed to the time variable of the workflow to refresh the data generated within the specified period of time.

    Resource Queue

    The destination queue in which the data backfill task is run. Default value: default-queue.

  6. Click OK.
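The backfill interval is passed to the workflow's time variable, so the workflow runs once per period in the interval. The following Python sketch is an illustration only, not product code; it shows how a daily interval could expand into the yyyyMMdd values that a ${date}-style variable would take, one per backfill run:

```python
from datetime import date, timedelta

def backfill_dates(start: date, end: date) -> list:
    """yyyyMMdd values for each day in a closed backfill interval."""
    days = (end - start).days
    return [(start + timedelta(days=i)).strftime("%Y%m%d")
            for i in range(days + 1)]

print(backfill_dates(date(2025, 11, 1), date(2025, 11, 3)))
# → ['20251101', '20251102', '20251103']
```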

Manage data backfill instances

Data backfill instances can be managed in the same way as workflow instances. For more information, see Manage workflow instances and task instances. You can perform the following operations to view data backfill instances:

Find the desired workflow and click the name of the workflow to go to the details page of the workflow.


On the Overview tab, view all data backfill instances of the workflow, and the execution time and the status of each data backfill instance.

View the status of a workflow

You can view the status of all workflow instances of a workflow in the Status column. For example, if a workflow runs once a day for five days, five workflow instances are generated. The Status column displays the status of each workflow instance and the number of times the workflow stays in each state.

The Status column uses the following colors:

  • Purple: Pending

  • Blue: Running

  • Green: Successful

  • Red: Failed

Edit a workflow

  1. Log on to the management console of Realtime Compute for Apache Flink.

  2. Find the workspace that you want to manage and click Console in the Actions column.

  3. In the left-side navigation pane, choose O&M > Workflows.

  4. Find the workflow that you want to manage and click Edit Workflow in the Actions column.

    For more information about how to modify the parameters, see the Create a workflow section of this topic.

    Note

    If the switch in the State column of a workflow is turned on, you cannot edit the workflow.
