DataWorks: SSH node

Last Updated: Oct 28, 2025

The DataWorks Secure Shell (SSH) node enables you to specify an SSH data source to remotely access the associated host and run scripts. For example, you can remotely access an Elastic Compute Service (ECS) instance from DataWorks and run scripts on a recurring schedule. This topic describes how to develop tasks using an SSH node.

Usage notes

  • If an SSH node starts a process on a remote host and the SSH node task then exits unexpectedly (for example, because of a timeout), the process on the remote host keeps running. DataWorks does not send a command to the remote host to stop the process.

  • SSH nodes support standard Shell syntax but not interactive syntax.

  • When you use an SSH node to remotely run a script on an ECS instance, a temporary file is created on the instance. Ensure that the ECS instance has sufficient disk space and that the maximum file count limit meets your requirements.

  • Do not let multiple tasks operate on the same file at the same time. Concurrent operations on one file can cause errors in the SSH node.
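
One way to keep tasks from operating on the same file at the same time is a lock directory on the remote host. The following is a minimal sketch, not a DataWorks feature: the lock path and the function names acquire_lock and release_lock are hypothetical, and it assumes that all tasks touching the shared file run on the same host and use the same lock path.

```shell
#!/bin/sh
# Hypothetical lock location; every task that touches the shared file must use the same path.
LOCK_DIR="${TMPDIR:-/tmp}/sshnode_demo.lock"

acquire_lock() {
    # mkdir fails if the directory already exists, so only one caller can create it.
    mkdir "$LOCK_DIR" 2>/dev/null
}

release_lock() {
    rmdir "$LOCK_DIR" 2>/dev/null
}

if acquire_lock; then
    # Release the lock when this script exits normally or through "exit".
    trap release_lock EXIT
    echo "lock acquired, safe to modify the shared file"
    # Work on the shared file here.
else
    echo "another task holds the lock, exiting" >&2
    exit 1
fi
```

If the script is killed before the trap runs, the lock directory can be left behind and may need to be removed manually before the next run.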

Prerequisites

  • A workflow is created.

    Development operations in different types of compute engines are performed based on workflows in DataStudio. Therefore, before you create a node, you must create a workflow. For more information, see Create a workflow.

  • An SSH data source is created.

    You must create an SSH data source to remotely access your SSH server. This lets you develop and schedule recurring SSH tasks in an SSH node. For more information, see SSH data source.

    Note

    SSH nodes only support SSH data sources created using a Java Database Connectivity (JDBC) connection string. To prevent task failures, ensure that the data source can connect to the resource group over the network.

  • (Required if you use a RAM user to develop tasks) The RAM user is added to the DataWorks workspace as a member and is assigned the Develop or Workspace Administrator role. The Workspace Administrator role has more permissions than necessary. Exercise caution when you assign the Workspace Administrator role. For more information about how to add a member and assign roles to the member, see Add workspace members and assign roles to them.

Limitations

  • You can run SSH tasks on Serverless resource groups. For more information, see Use Serverless resource groups.

  • Supported regions: China (Hangzhou), China (Shanghai), China (Beijing), China (Shenzhen), China (Hong Kong), Japan (Tokyo), Singapore, Malaysia (Kuala Lumpur), Indonesia (Jakarta), Germany (Frankfurt), UK (London), US (Silicon Valley), and US (Virginia).

  • The maximum length of code that can be run in an SSH node is 128 KB.
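
The code length limit can be checked locally before you paste a script into the node. The following is a minimal sketch, assuming that 128 KB means 128 × 1,024 bytes (the limit's exact byte count is not stated here); the names MAX_BYTES and fits_in_ssh_node are hypothetical.

```shell
#!/bin/sh
# Assumption: 1 KB = 1024 bytes, so the limit is 131072 bytes.
MAX_BYTES=$((128 * 1024))

fits_in_ssh_node() {
    # wc -c counts bytes; reading from stdin keeps the file name out of the output.
    # The arithmetic expansion strips any padding that wc adds on some systems.
    size=$(($(wc -c < "$1")))
    [ "$size" -le "$MAX_BYTES" ]
}
```

For example, `fits_in_ssh_node ./my_task.sh && echo "within limit"` prints the message only when the hypothetical file my_task.sh is within the assumed limit.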

Step 1: Create an SSH node

  1. Go to the DataStudio page.

    Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose Data Development and O&M > Data Development. On the page that appears, select the desired workspace from the drop-down list and click Go to Data Development.

  2. Right-click the target workflow and choose Create Node > SSH.

  3. In the Create Node dialog box, enter a Name for the node and click OK. The node is created. You can now develop and configure the task in the node.

Step 2: Develop the SSH task

(Optional) Select an SSH data source

If you created multiple SSH data sources in your workspace, you must select a data source on the SSH node editing page. If you created only one SSH data source, it is used by default.

Develop code: Simple example

Write the task code in the code editor of the SSH node. The following code is an example.

# 1. Prepare the environment.
# Identify the script that you want to run on the remote host. This example assumes that nihao.sh exists in the /tmp directory of the remote host.
# For testing purposes, you can run the following command in the SSH node to create the nihao.sh file.
echo "echo nihao,dataworks" > /tmp/nihao.sh
# 2. Run the script on the remote host.
# The SSH node runs /tmp/nihao.sh on the host of the selected SSH data source.
sh /tmp/nihao.sh
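
A slightly more defensive variant of the same example checks that the script exists and captures its exit status. This is a sketch only: the variable names SCRIPT, output, and status are illustrative, and how the SSH node reacts to a non-zero exit status is an assumption that you should verify in your environment.

```shell
#!/bin/sh
SCRIPT=/tmp/nihao.sh

# Create the test script, as in the example above.
echo "echo nihao,dataworks" > "$SCRIPT"

# Stop early with a clear message if the script is missing.
if [ ! -f "$SCRIPT" ]; then
    echo "missing $SCRIPT" >&2
    exit 1
fi

# Run the script and keep both its output and its exit status.
output=$(sh "$SCRIPT")
status=$?

echo "$output"                # prints: nihao,dataworks
echo "exit status: $status"   # prints: exit status: 0
```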

Develop code: Use scheduling parameters

DataWorks provides Scheduling Parameters that allow you to use dynamic request parameters in recurring scheduling scenarios. You can define variables in your node task code using the ${variable_name} format and assign values to the variables in the Schedule > Parameters section in the right-side navigation pane of the node editing page. For more information about the supported formats and configurations of scheduling parameters, see Supported formats of scheduling parameters and Configure and use scheduling parameters.

The following example shows how to use scheduling parameters in an SSH node.

# Requirement: Write the run time of the SSH node to the /tmp/sshnode.log file every day.
# Implementation: Reference the ${myDate} variable in the code and assign the value $[yyyy-mm-dd hh24:mi:ss] to the myDate variable in the scheduling parameters. Each run then writes its run time to the file.
echo "${myDate}" > /tmp/sshnode.log
cat /tmp/sshnode.log
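
To try the same logic outside DataWorks, you can stand in for the scheduling parameter with an ordinary shell variable. The following is a local sketch under stated assumptions: the fixed timestamp is a made-up sample value, the temporary log file replaces /tmp/sshnode.log, and `>>` is used instead of `>` so that each run appends a line rather than overwriting the file.

```shell
#!/bin/sh
# Local stand-in for the value that the scheduler would supply for ${myDate}.
myDate="2025-10-28 10:00:00"

# Hypothetical temporary log file so that the sketch does not touch /tmp/sshnode.log.
LOG=$(mktemp)

# Append the run time; quoting keeps the space inside the timestamp intact.
echo "${myDate}" >> "$LOG"
cat "$LOG"
```

Appending keeps a history of run times across daily runs, whereas the original `>` redirection keeps only the most recent one.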

Step 3: Configure task scheduling properties

If you want the system to periodically run a task on the node, you can click Properties in the right-side navigation pane on the configuration tab of the node to configure task scheduling properties based on your business requirements. For more information, see Overview.

Note

You must configure the Rerun and Parent Nodes parameters on the Properties tab before you commit the task.

Step 4: Debug task code

You can perform the following operations to check whether the task is configured as expected based on your business requirements:

  1. Optional. Select a resource group and assign custom parameters to variables.

    • Click the Run with Parameters icon in the top toolbar of the configuration tab of the node. In the Parameters dialog box, select the resource group for scheduling that you want to use to debug and run the task code.

    • If your task code uses scheduling parameters, assign values to the corresponding variables for debugging. For more information about the value assignment logic of scheduling parameters, see Debugging procedure.

  2. Save and run task code.

    In the top toolbar, click the Save icon to save the task code. Then, click the Run icon to run the task code.

  3. Optional. Perform smoke testing.

    When or after you commit the node, you can perform smoke testing on the node in the development environment to check whether the node runs as expected. For more information, see Perform smoke testing.

Step 5: Commit and deploy the task

After a task on a node is configured, you must commit and deploy the task. After you commit and deploy the task, the system runs the task on a regular basis based on scheduling configurations.

  1. Click the Save icon in the top toolbar to save the task.

  2. Click the Submit icon in the top toolbar to commit the task on the node.

    In the Submit dialog box, configure the Change description parameter. Then, determine whether to review task code after you commit the task based on your business requirements.

    Note

    • You must configure the Rerun and Parent Nodes parameters on the Properties tab before you commit the task.

    • You can use the code review feature to ensure the code quality of tasks and prevent task execution errors caused by invalid task code. If you enable the code review feature, the node code that is committed can be deployed only after the node code passes the code review. For more information, see Code review.

If you use a workspace in standard mode, you must deploy the task in the production environment after you commit the task. To deploy a task on a node, click Deploy in the upper-right corner of the configuration tab of the node. For more information, see Deploy tasks.

More operations

Task O&M: After you commit and deploy the task, the task is periodically run based on the scheduling configurations. You can click Operation Center in the upper-right corner of the configuration tab of the corresponding node to go to Operation Center and view the scheduling status of the task. For more information, see View and manage auto triggered tasks.

References

For more information about how to implement load balancing and high availability for SSH nodes, see Implement load balancing and high availability for SSH nodes.