This topic describes how to use the task orchestration feature of Data Management (DMS) to efficiently schedule events.

Background information

Conventional event schedulers, such as MySQL Event Scheduler, provide powerful capabilities in scheduling database events. However, these event schedulers have the following limits:
  • High technical requirements: You must be familiar with the syntax of specific SQL statements that are used to define events, such as CREATE EVENT and ALTER EVENT. You cannot define events on a graphical user interface (GUI).
  • Dependent on the database kernel: The kernel must support the event scheduler and the event scheduler must be enabled.
  • Applicable only to a single database: You can schedule events only for a single database. You cannot use a conventional event scheduler to schedule events across databases or use the scheduler with other tools.
  • Difficult to track: You cannot view the execution status, history, and duration of a scheduled event when you use a conventional event scheduler.
  • Difficult O&M: When you use a conventional event scheduler, you cannot pause or rerun an event, and cannot restore an event after the event fails to execute.
  • Incapable of sending notifications: Conventional event schedulers cannot send text messages or emails to inform you whether an event is executed or fails to execute.
To resolve these issues, we recommend that you use the task orchestration feature of DMS. This feature provides you with an external orchestration and scheduling system that is independent of the database kernel. The task orchestration feature provides you with the following benefits:
  • The task orchestration feature supports more than 10 types of database engines and a variety of database tools. For example, the database engines include MySQL, Oracle, PostgreSQL, and SQL Server. The database tools provide features such as data migration, data backup, and cross-database analysis. This way, you can schedule events across databases and services with ease.
  • You can create complex scheduling tasks by dragging nodes and performing simple configurations on a GUI.
  • The task orchestration feature provides a variety of notification channels, such as emails, text messages, and DingTalk notifications.
  • You can perform various O&M operations, such as Pause, Exits, and Rerun.

Procedure

In this example, the task orchestration feature of DMS is used to create the event_click_archive event for the following purpose: Transfer the data that is stored in the user_click table two days before to the click_history table every day at 00:00 starting from December 1, 2020. Delete the transferred data from the user_click table so that the user_click table stores only the data of the previous day.

The following SQL statements define the event:

DELIMITER |

CREATE EVENT IF NOT EXISTS event_click_archive
ON SCHEDULE EVERY 1 DAY STARTS '2020-12-01 00:00.00'
ON COMPLETION PRESERVE
COMMENT 'Periodically archive data in the user_click table.'
DO
  BEGIN
    INSERT INTO click_history SELECT * FROM user_click
      WHERE create_time < DATE_SUB(CURRENT_TIMESTAMP, INTERVAL 1 DAY);
    DELETE FROM user_click WHERE create_time < DATE_SUB(CURRENT_TIMESTAMP, INTERVAL 1 DAY);
  END |

DELIMITER ;
  1. Log on to the DMS console V5.0.
  2. In the top navigation bar, click DTS. In the left-side navigation pane, choose Data Development > Task Orchestration.
  3. On the page that appears, click Create Task Flow.
  4. In the Create Task Flow dialog box, set the Task Flow Name parameter to df_archive_click and the Description parameter to test. Then, click OK. The Task Orchestration tab appears.
    Create Task Flow
  5. From the Task Type panel on the left side of the Task Orchestration tab, drag the Single Instance SQL node under Database Processing Nodes to the canvas.
  6. Double-click the Single Instance SQL node on the canvas and configure the following parameters.
    1. Select the database that you want to manage from the Database drop-down list.
    2. Enter the following SQL statements:
      INSERT INTO click_history SELECT * FROM user_click
            WHERE create_time < DATE_SUB(CURRENT_TIMESTAMP, INTERVAL 1 DAY);
            
      DELETE FROM user_click WHERE create_time < DATE_SUB(CURRENT_TIMESTAMP, INTERVAL 1 DAY);
      Note The task orchestration feature does not support local variables. If local variables are declared in the definition of an event, you must replace the local variables with user-defined variables when you use the task orchestration feature to create the event. You must also use a SET statement to assign values to the user-defined variables. In the following example, a local variable is replaced with a user-defined variable:

      The following SQL statements declare the local variable:

      DECLARE n_total BIGINT DEFAULT 0;
      SELECT COUNT(*) INTO n_total FROM test_table;

      The following SQL statement assigns a value to the user-defined variable:

      SET @n_total := (SELECT COUNT(*) FROM test_table);
    3. After you complete the configuration, Click Save.
      Note If you select Automatic Save, DMS automatically saves the changes during the configuration process, and the Save button is unavailable.
  7. Click Try Run in the upper-left corner to test the task.
    • If status SUCCEEDED appears in the last line of the logs, the test run is successful.
    • If status FAILED appears in the last line of the logs, the test run fails. In this case, view the node on which the failure occurs and the reason for the failure in the logs. Then, modify the configuration of the node and try again.
  8. Configure and view information about the task flow below the task flow canvas.
    1. Click the Task Flow Information tab in the lower part of the page. Set the Task Flow Name, Description, Owner, Stakeholders, Error Handling Policy, and Concurrency Control Policy parameters in the Properties section.
    2. In the Scheduling Settings section of the Task Flow Information tab, turn on Enable Scheduling and configure the parameters as required.
      Table 1. Parameters for task scheduling
      Parameter Description
      Scheduling Type The scheduling type of the task flow. Cyclic scheduling is selected in this example. Valid values:
      • Cyclic scheduling: The task flow is periodically scheduled. For example, the task flow is run once a week.
      • Schedule once: The task flow is run once at a specific point in time. You need to specify only the point in time when the task flow is run.
      Effective Time The period during which the scheduling parameters take effect. The default time period is from January 1, 1970 to January 1, 9999, which indicates that the scheduling parameters permanently take effect.

      In this example, this parameter is set to December 1, 2020 to January 1, 9999.

      Scheduling Cycle The scheduling cycle of the task flow. Day is selected in this example. Valid values:
      • Hour: The task flow is run within the hours that you specify. If you select this value, you must set the Timed Scheduling parameter.
      • Day: The task flow is run at the specified point in time every day. If you select this value, you must specify the Specific Point in Time parameter.
      • Week: The task flow is run at the specified point in time on the days that you select every week. If you select this value, you must specify the Specified Time and Specific Point in Time parameters.
      • Month: The task flow is run at the specified point in time on the days that you select every month. If you select this value, you must specify the Specified Time and Specific Point in Time parameters.
      Specific Point in Time The point in time of the specified days at which the task flow is run.

      In this example, this parameter is set to 00:00. DMS runs the task at 00:00 every day starting from December 1, 2020.

      Cron Expression The CRON expression that is automatically generated based on the specified scheduling cycle and time settings.
    3. If you want to receive notifications about the execution status of the task flow, turn on the following switches on the Notification Configurations tab as needed:
      • Success Notification: You are notified if the task flow is run as expected.
      • Failure Notification: You are notified if the task flow fails.
      • Timeout Notification: You are notified if the execution of the task flow times out.
  9. Publish the task flow. For more information about how to publish task flows, see Publish a task flow.

View the execution status of a task flow

  1. Log on to the DMS console V5.0.
  2. In the top navigation bar, click DTS. In the left-side navigation pane, choose Data Development > Task Orchestration.
  3. Click the name of the task flow that you want to view to go to the details page of the task flow.
  4. Click Go to O&M in the upper-right corner of the canvas to view information on the task flow O&M page.
    • View basic information about the task flow. You can view the Created At, Changed At, Scheduling Settings, and Released parameters of the task flow.
    • Click the Running History tab. On the tab that appears, click Scheduling Trigger or Triggered Manually to view the records that are generated each time the task flow is triggered.
      Note
      • Scheduling Trigger: The task flow is automatically run at a specified point in time based on the scheduling parameters that you configure.
      • Triggered Manually: The task flow is run after you click Try Run.
      • Find the record that you want to view and click the Plus icon icon next to the Status column to view the detailed operation logs of the task flow.
      • Find the record that you want to view and click Executive History in the Operating column. In the panel that appears, you can view the Operating Time, Operator, and Content parameters of the task flow.
      • Find a record of the task flow that you want to manage and click Exits, Rerun, Pause, Restore, or Set Successfully in the Operating column to manage the task flow based on its status.
        Note
        • You can rerun a task flow that is run as expected.
        • You can set the status of a failed task flow to Success.
        • You can terminate or pause a task flow that is running.
    • Click the Published Tasks tab. On the tab that appears, you can view the Version ID, Published By, and Published At parameters of the task flow. You can also click Details and DAG to view the details and directed acyclic graph (DAG) of the task flow.

      Select two records that have different Version ID and click Version Comparison to compare the differences between two versions of the task flow.