
DataWorks: SSH node

Last Updated: Mar 26, 2026

The SSH node in DataWorks connects to a remote host through a Secure Shell (SSH) data source and runs scripts on a schedule. A common use case is running scripts on an Elastic Compute Service (ECS) instance from DataWorks on a recurring basis.

Behavior and limitations

Aspect | Detail
Process isolation | If an SSH node task exits unexpectedly (for example, due to a timeout), the process running on the remote host continues unaffected. DataWorks does not send a stop command to the remote host.
Shell syntax | SSH nodes support standard Shell syntax. Interactive syntax is not supported.
Disk space | When an SSH node runs a script on an ECS instance, it creates a temporary file on the instance. Make sure the instance has enough disk space and that the maximum file count limit meets your requirements.
File concurrency | Avoid running multiple tasks on the same file simultaneously. Concurrent access to the same file can cause task errors.
Resource group | SSH tasks run on Serverless resource groups. For more information, see Use Serverless resource groups.
Maximum code length | 128 KB per SSH node
Supported regions | China (Hangzhou), China (Shanghai), China (Beijing), China (Shenzhen), China (Hong Kong), Japan (Tokyo), Singapore, Malaysia (Kuala Lumpur), Indonesia (Jakarta), Germany (Frankfurt), UK (London), US (Silicon Valley), and US (Virginia)
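
Because DataWorks does not stop the remote process when a task exits, one option is to have the remote host enforce its own time limit. The following is a minimal sketch, assuming the remote host provides the GNU coreutils timeout command. The MAX_SECONDS value and the sleep 3 stand-in job are hypothetical and only illustrate the pattern.

```shell
# Hypothetical values: MAX_SECONDS and the "sleep 3" stand-in job are for illustration only.
MAX_SECONDS=1

# timeout (GNU coreutils) ends the command once the limit passes,
# so the limit is enforced on the remote host itself.
rc=0
timeout "${MAX_SECONDS}" sleep 3 || rc=$?

if [ "${rc}" -eq 124 ]; then
  job_status="timed out"   # GNU timeout exits with 124 when the limit is hit
elif [ "${rc}" -eq 0 ]; then
  job_status="finished"
else
  job_status="failed"
fi
echo "job ${job_status}"
```

With this pattern the job cannot outlive its limit, regardless of whether the SSH node task is still attached. Replace the stand-in command with your own job and pick a limit that suits it.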

Prerequisites

Before you begin, make sure you have:

  • A workflow. Development operations in DataStudio are performed within workflows. For more information, see Create a workflow.

  • An SSH data source configured with a Java Database Connectivity (JDBC) connection string. Make sure the data source can reach the resource group over the network. For more information, see SSH data source.

  • (Required for RAM users) The RAM user is added to the workspace as a member with the Develop or Workspace Administrator role assigned. The Workspace Administrator role grants broader permissions than typically needed — assign it with caution. For more information, see Add workspace members and assign roles to them.

Step 1: Create an SSH node

  1. Go to the DataStudio page. Log on to the DataWorks console. In the top navigation bar, select the target region. In the left-side navigation pane, choose Data Development and O&M > Data Development. Select the target workspace from the drop-down list and click Go to Data Development.

  2. Right-click the target workflow and choose Create Node > SSH.

  3. In the Create Node dialog box, enter a Name for the node and click OK.

Step 2: Develop the SSH task

Select an SSH data source (optional)

If the workspace has multiple SSH data sources, select one on the SSH node editing page. If only one SSH data source exists, it is used automatically.

SSH nodes only support SSH data sources created using a JDBC connection string. Make sure the data source can reach the resource group over the network.

Run a shell script on a remote host

Write the task code in the code editor. The following example prepares and runs a shell script on the remote host:

# Step 1: Prepare the script on the remote host.
# For testing, create a nihao.sh file in the /tmp directory of the remote host.
echo "echo nihao,dataworks" > /tmp/nihao.sh

# Step 2: Run the script on the remote host.
sh /tmp/nihao.sh
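
A variation on the example above, offered as a sketch and not as the recommended DataWorks method: it writes the script to a uniquely named temporary file, captures the output and exit status, and removes the file afterwards. This is one way to avoid two tasks writing to the same file and to keep temporary files from accumulating on the remote host. The nihao.XXXXXX name pattern and the variable names are hypothetical.

```shell
# Hypothetical file name pattern; mktemp picks a unique name under /tmp.
script_path="$(mktemp /tmp/nihao.XXXXXX)"

# Write the script body with a quoted here-document so nothing is expanded early.
cat > "${script_path}" <<'EOF'
echo "nihao,dataworks"
EOF

# Run the script, capture its output, and keep its exit status.
rc=0
output="$(sh "${script_path}")" || rc=$?
echo "${output}"

# Remove the temporary script so files do not pile up on the remote host.
rm -f "${script_path}"
```

The rc variable holds the script's exit status, so later lines can decide how to react to a failure.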

Use scheduling parameters to pass dynamic values

DataWorks Scheduling Parameters let you inject dynamic values into recurring tasks. Define variables in your code using the ${variable_name} format, then assign values to them in Schedule > Parameters in the right-side navigation pane.

For supported variable formats, see Supported formats of scheduling parameters. For a step-by-step guide, see Configure and use scheduling parameters.

The following example writes the node's run time to a log file each day using the ${myDate} variable, with the value $[yyyy-mm-dd hh24:mi:ss] assigned in the Parameters section:

# Write the current run time to /tmp/sshnode.log.
echo "${myDate}" > /tmp/sshnode.log
cat /tmp/sshnode.log
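
The example above overwrites the log on every run. If you want to keep a history of run times, a sketch along the following lines appends instead. It assumes DataWorks substitutes the ${myDate} placeholder before the code runs. The sample value, the run_time and log_file names, and the log path are hypothetical. The sample assignment exists only so the sketch also runs outside DataWorks.

```shell
# Sample value for running this sketch outside DataWorks (hypothetical).
myDate="2026-03-26 00:00:00"

# Inside DataWorks, the placeholder on the next line is assumed to be
# replaced with the scheduled value before the code runs.
run_time="${myDate}"

# Hypothetical log path. ">>" appends, so earlier runs stay in the file.
log_file="/tmp/sshnode_history.log"
echo "${run_time}" >> "${log_file}"

# Show the entry that was just written.
tail -n 1 "${log_file}"
```

Quoting the value keeps the date and time together as one string, including the space between them.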

Step 3: Configure task scheduling properties

To run a task on a recurring schedule, click Properties in the right-side navigation pane and configure the scheduling properties to match your requirements. For more information, see Overview.

Configure the Rerun and Parent Nodes parameters on the Properties tab before committing the task.

Step 4: Debug task code

  1. (Optional) Select a resource group and assign values to scheduling parameters.

    • Click the Run with Parameters icon in the toolbar. In the Parameters dialog box, select the resource group to use for debugging.

    • If the task code uses scheduling parameters, assign test values to the variables. For details on how value assignment works, see Debugging procedure.

  2. Save and run the task code. Click the Save icon to save the code, then click the Run icon to run it.

  3. (Optional) Perform smoke testing. When or after you commit the node, you can run smoke testing in the development environment to verify that the node behaves as expected. For more information, see Perform smoke testing.

Step 5: Commit and deploy the task

  1. Click the Save icon to save the task.

  2. Click the Submit icon to commit the task. In the Submit dialog box, enter a Change description and choose whether to trigger a code review.

    • Configure the Rerun and Parent Nodes parameters on the Properties tab before committing.

    • If code review is enabled, the node can only be deployed after the code passes review. For more information, see Code review.

  3. (Standard mode workspaces only) Deploy the task to the production environment. Click Deploy in the upper-right corner of the node editing page. For more information, see Deploy tasks.

Monitor and manage the task

After the task is committed and deployed, it runs automatically on the configured schedule. Click Operation Center in the upper-right corner of the node editing page to view the scheduling status and manage task instances. For more information, see View and manage auto triggered tasks.
