An FTP Check node periodically checks over File Transfer Protocol (FTP) whether a specific file exists. If the node detects the file, the scheduling system starts to run the descendant nodes of the FTP Check node. Otherwise, the node retries the check at the configured interval and keeps retrying until the condition for stopping the check is met. In most cases, FTP Check nodes are used for communication between the DataWorks scheduling system and external scheduling systems. This topic describes how to use an FTP Check node and the related precautions.
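The check-and-retry behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the DataWorks implementation: the function name wait_for_file and the parameters interval_seconds and max_checks are hypothetical, and the stop condition is assumed to be a maximum number of checks.

```python
import time
from typing import Callable


def wait_for_file(
    file_exists: Callable[[], bool],
    interval_seconds: float,
    max_checks: int,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the file is found or the check budget is used up.

    Returns True as soon as file_exists() reports the file, and False
    after max_checks unsuccessful checks (the assumed stop condition).
    """
    for attempt in range(1, max_checks + 1):
        if file_exists():
            return True  # descendant nodes could be started at this point
        if attempt < max_checks:
            sleep(interval_seconds)  # wait for the configured interval
    return False
```

The sleep function is passed in as a parameter so that the loop can be exercised without real waiting, for example by recording the requested delays in a list.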
Prerequisites
- An FTP data source is added. For more information, see Add an FTP data source.
- An exclusive resource group for scheduling is created. For more information, see Billing of exclusive resource groups for scheduling (subscription).
- A workflow is created. For more information, see Create an auto triggered workflow.
Background information
An FTP Check node is typically used in the following scenario: A node in the DataWorks scheduling system needs to read from an external database that belongs to an external scheduling system, and the task that writes data to the database is not run by DataWorks. In this case, DataWorks does not know when the data write task is complete or when the database can be accessed. If the node accesses the database before the data write task is complete, the read may fail or return incomplete data. To ensure that the node can read the data successfully, you can have the external scheduling system generate a marker that indicates that the data write task is complete. For example, the external scheduling system can generate a marker file with the suffix .done in the file system. Then, you can create an FTP Check node in the DataWorks scheduling system to periodically check whether the marker file with the suffix .done exists. If the file exists, the node that needs to access the external database can be scheduled.
Note
- You can specify the file system that is used to store the marker files.
- In this example, a marker file with the suffix .done is used. You can customize the format, name, and other attributes of your marker file.
Note External databases include but are not limited to Oracle, MySQL, and SQL Server.
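The marker-file handshake in this scenario can be sketched with the Python standard library ftplib module. This is a minimal illustration under stated assumptions, not code that DataWorks or any external scheduling system provides: the helper names publish_marker and marker_exists, the host ftp.example.com, the credentials, and the path /exports/orders.done are all hypothetical.

```python
import io
import posixpath
from ftplib import FTP


def publish_marker(ftp: FTP, directory: str, name: str) -> None:
    """Upload an empty marker file.

    Intended to be called by the external system only after its data
    write task has finished.
    """
    ftp.storbinary(f"STOR {posixpath.join(directory, name)}", io.BytesIO(b""))


def marker_exists(ftp: FTP, directory: str, name: str) -> bool:
    """Return True if the directory listing contains the marker file."""
    # NLST may return bare names or full paths depending on the server,
    # so only the base names are compared.
    return any(posixpath.basename(entry) == name for entry in ftp.nlst(directory))


# Example wiring (hypothetical host, credentials, and paths):
# with FTP("ftp.example.com") as ftp:
#     ftp.login("user", "password")
#     publish_marker(ftp, "/exports", "orders.done")
#     print(marker_exists(ftp, "/exports", "orders.done"))
```

Some FTP servers answer NLST on an empty directory with an error instead of an empty listing. The sketch does not handle that case, so a production check would need to decide how to treat such responses.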
Limits
- Only the China (Beijing), China (Shanghai), China (Hangzhou), China (Shenzhen), China (Zhangjiakou), China (Chengdu), and Singapore (Singapore) regions support FTP Check nodes.
- FTP Check nodes can run only on exclusive resource groups for scheduling.
- If an FTP Check node is scheduled by minute or hour, you can set the Check stop policy parameter only to Number of Check stops for the node.
Create an FTP Check node
- Go to the DataStudio page.
  - Log on to the DataWorks console.
  - In the left-side navigation pane, click Workspaces.
  - In the top navigation bar, select the region in which the workspace that you want to manage resides. Find the workspace and click DataStudio in the Actions column.
- On the DataStudio page, move the pointer over the icon and choose the option for creating an FTP Check node. Alternatively, find the desired workflow, right-click the workflow name, and then choose the same option.
- In the Create Node dialog box, configure the Name and Path parameters. Note The node name must be 1 to 128 characters in length and can contain letters, digits, underscores (_), and periods (.).
- Click Commit.
- Click the Properties tab in the right-side navigation pane and configure properties for the FTP Check node. The properties include basic properties, time properties, resource properties, and scheduling dependencies. For more information, see Configure basic properties, Configure time properties, Configure the resource property, and Configure same-cycle scheduling dependencies.
- Configure a detection object and a detection policy.
- Save and commit the node. Important You must configure the Rerun and Parent Nodes parameters on the Properties tab before you commit the node.
  - Click the icon in the top toolbar to save the node.
  - Click the icon in the toolbar to commit the node.
  - In the Commit Node dialog box, configure the Change description parameter.
  - Click OK.
  If the workspace that you use is in standard mode, you must click Deploy in the upper-right corner to deploy the node after you commit it. For more information, see Deploy nodes.
- Perform O&M operations on the node. For more information, see Perform basic O&M operations on auto triggered nodes.