This topic describes how to create a script task by using the task orchestration feature of Data Management (DMS) together with Database Gateway.

Background information

The scripts for the following tasks are stored on servers. You can use the task orchestration feature of DMS together with Database Gateway to schedule the scripts in a centralized manner.
  • Process data by using advanced tools to produce various business models. The advanced tools include the NumPy and scikit-learn libraries for Python, and the MLlib library of Apache Spark. For example, refine data in a search or recommendation system by using advanced tools.
  • Consume data. For example, a script can export the data that it reads to an Excel file, or automatically send emails that contain the data.

Create a database gateway

This section describes how to create a database gateway on the server where the script you want to run resides.
Note One database gateway corresponds to one server.

For example, if you need to run scripts on three Elastic Compute Service (ECS) instances, you must create three database gateways instead of creating three nodes within one database gateway.

  1. Install the database gateway on the server where the script you want to run resides. Only the Linux operating system is supported. You are not allowed to install and start the database gateway as a root user. For more information, see the first three steps in Create a database gateway.
    Note If you install a database gateway on an ECS instance, we recommend that you select Access through Alibaba Cloud VPC internal address (ECS self-built library/leased line/CEN/VPN gateway).
  2. Create a directory named dg_scripts in the home directory of the user under which the database gateway is installed. For example, if the current user is xiaoming, run the mkdir dg_scripts command in the /home/xiaoming directory.
  3. Move the shell script you want to run to the dg_scripts directory. For example, if the shell script is named demo.sh, run the mv demo.sh /home/xiaoming/dg_scripts command in the directory where the script resides.
    Note A script name can contain only letters, digits, underscores (_), and periods (.).

    In this example, the script contains the following content:

    echo helloworld
    echo '{"hello": "world"}'
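
The staging steps above can be sketched as a small shell helper. This is a minimal sketch and not part of Database Gateway: the function names is_valid_script_name and stage_script are hypothetical, and the home directory is passed in explicitly so that the example does not depend on a particular user.

```shell
# Hypothetical helper: succeed only if the name consists of letters,
# digits, underscores (_), and periods (.).
is_valid_script_name() {
    case "$1" in
        "" | *[!A-Za-z0-9_.]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Hypothetical helper: create <home>/dg_scripts if needed and move the
# script into it. Usage: stage_script <path-to-script> <home-directory>
stage_script() {
    script_path=$1
    home_dir=$2
    script_name=$(basename "$script_path")
    if ! is_valid_script_name "$script_name"; then
        echo "invalid script name: $script_name" >&2
        return 1
    fi
    mkdir -p "$home_dir/dg_scripts" &&
        mv "$script_path" "$home_dir/dg_scripts/"
}
```

For example, stage_script ./demo.sh /home/xiaoming would move demo.sh to /home/xiaoming/dg_scripts, and a name such as bad-name.sh would be rejected before anything is moved.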

Create a script task

This section describes how to create a script task in the DMS console.

  1. Log on to the DMS console V5.0.
    Note To switch to the previous version of the DMS console, click the tenant avatar icon in the lower-right corner of the page. For more information, see Switch to the previous version of the DMS console.
  2. In the top navigation bar, click DTS. In the left-side navigation pane, choose Data Development > Task Orchestration.
    Note If you are using the previous version of the DMS console, move the pointer over the More icon in the top navigation bar and choose Data Factory > Task Orchestration (New).
  3. Click Create Task Flow.
    Note If you are using the previous version of the DMS console, click the Develop Space icon in the left-side navigation pane. Then, click New Task Flow.
  4. In the New Task Flow dialog box, set the Task Flow Name and Description parameters and click OK.
  5. From the node list in the left-side pane, drag Script to the blank area on the canvas.
  6. Click the script task that you create, and then click the Rename icon to rename the node.
  7. Click the script task that you create. In the lower-left corner of the page, configure the database gateway for the script task.
    Parameter | Description
    Region | The region where the database gateway resides.
    Gateway ID | The name of the database gateway.
      Note You can view the name of a database gateway on the Gateway List page in the Database Gateway console.
    Gateway ID | The ID of the gateway node.
      Note You can view the ID of a gateway node on the Gateway details page of the corresponding gateway.
    File name | The name of the script in the dg_scripts directory where the database gateway is installed. For example, if the storage path of the script is /home/xiaoming/dg_scripts/demo.sh, enter demo.sh.
    Runtime Parameter | The variables that are used in the script. Script variables are classified into system variables, custom time variables, and output variables. You can enter or select variables of the preceding types. For more information, see Script variables.
  8. Click Try Run in the upper-left corner of the task flow editing page. The standard output of the script is displayed on the Execution Logs tab.

    The following figure shows the output of the sample script.

    [Figure: Try Run]
  9. Configure the script task for the task flow. For more information, see Configure a task flow.
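
Before you click Try Run, it can be useful to check locally what the script writes to standard output, because that is what the Execution Logs tab displays. The following is a rough sketch that does not call DMS. The helper names file_name_for and preview_stdout are hypothetical, and the sketch assumes that the script can be run with sh.

```shell
# Hypothetical helper: derive the value for the File name parameter
# from the storage path of the script.
file_name_for() {
    basename "$1"
}

# Hypothetical helper: run the script locally with sh and print its
# standard output.
preview_stdout() {
    sh "$1"
}
```

For example, file_name_for /home/xiaoming/dg_scripts/demo.sh prints demo.sh, which is the value to enter for File name.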

Script variables

  • System variables.
    Variable | Description | Example
    sys.flow.start.timestamp | The timestamp generated when the task is run. | 2021-05-24T11:20:07.562+08:00
    sys.flow.start.year | The year when the task is run. | 2021
    sys.flow.start.month | The month of the year when the task is run. | 5
    sys.flow.start.day | The day of the month when the task is run. | 24
    sys.flow.start.hour | The hour of the day when the task is run. | 11
    sys.flow.start.minute | The minute of the hour when the task is run. | 20
    sys.flow.start.second | The second of the minute when the task is run. | 7
    sys.flow.start.milliseconds | The millisecond of the second when the task is run. | 562
    sys.flow.start.timezone | The time zone. | Asia/Shanghai
    sys.flow.biztime | The data timestamp. By default, the data timestamp is one day before the time when the task is run. | 1621740007562
    sys.flow.name | The name of the task flow. | dwd_activityDailyPV
    sys.node.name | The name of the task. | Single Instance SQL-1
  • Custom time variables. For more information, see Configure variables.
  • Output variables. If the last line of the script output is a JSON string, the script task parses the string to obtain a key-value pair, where the value is used as an output variable and passed to the next task.

    For example, if the content of a script is echo '{"key":"hello"}', the script task parses the script output to obtain an output variable whose name is key and value is hello. The next task can obtain the variable by using the ${key} expression.
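
As a minimal sketch of the variables above, the following helpers convert a sys.flow.biztime value to a calendar date and print a one-line JSON object as the last line of output. Several points are assumptions rather than documented behavior: the helper names are hypothetical, the value is assumed to be an epoch timestamp in milliseconds (as the example 1621740007562 suggests), the conversion relies on the -d @SECONDS syntax of GNU date and reports the date in UTC, and how DMS delivers the value to the script is not covered here.

```shell
# Hypothetical helper: convert an epoch-millisecond value to YYYY-MM-DD (UTC).
# Assumes GNU date, which accepts "-d @SECONDS".
biztime_to_date() {
    seconds=$(( $1 / 1000 ))
    date -u -d "@$seconds" +%Y-%m-%d
}

# Hypothetical helper: print {"key":"value"} on one line.
# No JSON escaping is done, so the key and value must not contain
# double quotes or backslashes.
emit_output_variable() {
    printf '{"%s":"%s"}\n' "$1" "$2"
}

# Example script body: ordinary log lines first, then the JSON object
# as the last line so that it can be parsed as an output variable.
report_biz_date() {
    biz_date=$(biztime_to_date "$1") || return 1
    echo "data date resolved"
    emit_output_variable "biz_date" "$biz_date"
}
```

With this sketch, the last output line carries a variable named biz_date, so the next task would read it by using the ${biz_date} expression.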

What to do next

Publish a task flow.