An SSH node connects to a remote host through a specified SSH data source and runs Shell scripts on that host on a recurring schedule. Use it to trigger scripts on remote servers—such as Elastic Compute Service (ECS) instances—directly from DataWorks workflows without manually logging in.
Prerequisites
Before you begin, make sure you have:
- A RAM user added to your workspace with the Develop or Workspace Administrator role. The Workspace Administrator role grants more permissions than this task requires, so assign it with caution. See Add workspace members and assign roles to them.
- A serverless resource group associated with your workspace. See Use serverless resource groups.
- An SSH node created. See Create an auto triggered task.
- An SSH data source created using the Java Database Connectivity (JDBC) connection string mode and connected to the correct resource group. See Create an SSH data source.
Limitations
| Limitation | Detail |
| --- | --- |
| Code size | Scripts in an SSH node cannot exceed the code size limit. |
| Shell syntax | Standard Shell syntax only. Interactive Shell syntax is not supported. |
| Data source mode | Only SSH data sources created using the JDBC connection string mode are supported. |
Usage notes
Orphaned remote processes: When an SSH node task exits unexpectedly—for example, due to a timeout—the process on the remote host continues running. DataWorks does not send a termination command to the remote host.
Temporary files: Running scripts on an ECS instance generates temporary files on that instance. Make sure the instance has enough disk space and that the file count stays within the instance limits.
File conflicts: Avoid running multiple tasks that operate on the same file at the same time. Concurrent writes to the same file can cause SSH node exceptions.
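When two tasks must touch the same file, the conflict risk above can be reduced with an advisory lock. The following is a minimal sketch using the standard Linux `flock` utility; the lock and log paths are placeholders, not DataWorks requirements:

```shell
#!/bin/bash
LOCK=/tmp/sshnode.lock   # lock file path: an assumption for illustration
LOG=/tmp/shared.log      # the file that multiple tasks append to

# Open the lock file on file descriptor 9 and take an exclusive lock.
# A concurrent run of this script blocks here until the lock is free.
exec 9>"$LOCK"
flock 9

# Only one task writes to the shared file at a time.
echo "run at $(date)" >> "$LOG"

# The lock is released automatically when the script (and fd 9) exits.
```

Serializing writers this way only works if every task that touches the file uses the same lock path; `flock` is advisory, so processes that ignore the lock can still write concurrently.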
Step 1: Develop the SSH node
Select an SSH data source
If your workspace has multiple SSH data sources, select the one you want to use on the configuration tab of the SSH node. If only one SSH data source exists, it is selected automatically.
Make sure the selected data source is connected to the correct resource group to prevent task failures.
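One way to confirm that the selected data source and resource group actually reach the intended host is to run a trivial diagnostic script as the node body first. This sketch uses only standard Linux utilities:

```shell
# Print which host and user the SSH node actually runs as, to confirm
# the data source points at the expected machine.
hostname
whoami

# Check free space in /tmp, where DataWorks writes temporary files.
df -h /tmp
```

If the output shows an unexpected host or a nearly full /tmp, fix the data source or free disk space before deploying the real script.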
Write the script
Write your script in the code editor. The SSH node runs the script on the remote host when the task is triggered.
Example: Basic script
The following example creates a file on the remote host and runs it.
# Create hello.sh in the /tmp directory of the remote host.
echo "echo hello,dataworks" >/tmp/hello.sh
# Run the file.
sh /tmp/hello.sh
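For production scripts you may want stricter failure behavior than the minimal example above. The following sketch uses standard Bash options, not anything DataWorks-specific; whether a non-zero exit fails the node depends on your node configuration:

```shell
#!/bin/bash
# Stop at the first failing command, treat unset variables as errors,
# and propagate failures through pipes.
set -euo pipefail

# Write the helper script with a heredoc instead of a quoted echo,
# which avoids nested-quoting mistakes as the script grows.
cat > /tmp/hello.sh <<'EOF'
echo "hello,dataworks"
EOF

# If hello.sh exits non-zero, set -e aborts here with that exit code.
sh /tmp/hello.sh
```

Without `set -e`, a failure in the middle of the script would be silently ignored and the node could still report success.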
Example: Script with scheduling parameters
Scheduling Parameters let you inject dynamic values into your script at runtime. Define variables in the script using ${variable_name}, then assign values in Scheduling Configuration > Scheduling Parameters in the right-side panel.
The following example writes the daily run time of the SSH node to a log file. The ${myDate} variable is assigned $[yyyy-mm-dd hh24:mi:ss] as its scheduling parameter value.
# Write the run time to /tmp/sshnode.log.
echo ${myDate} >/tmp/sshnode.log
cat /tmp/sshnode.log
For supported variable formats, see Scheduling configurations.
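Note that `>` in the example above truncates `/tmp/sshnode.log` on every run, so only the latest run time survives. If you want a history of run times, one pattern is to append and prune; this is a sketch, and the 30-entry retention is an arbitrary assumption:

```shell
LOG=/tmp/sshnode.log

# ${myDate} is injected by DataWorks scheduling parameters; default to
# "unset" so the script also runs standalone outside DataWorks.
myDate="${myDate:-unset}"

# Append this run's timestamp instead of overwriting the file.
echo "${myDate}" >> "$LOG"

# Keep only the 30 most recent entries so the log cannot grow unbounded.
tail -n 30 "$LOG" > "${LOG}.tmp" && mv "${LOG}.tmp" "$LOG"
```

Bounding the log size also helps with the temporary-file and disk-space concerns listed in the usage notes.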
Configure scheduling properties
After writing the script, configure scheduling properties to run the SSH task on a recurring schedule. For details, see Scheduling configurations.
Step 2: Deploy and monitor the node
- Submit and publish the SSH node to the production environment. For details, see Publish a node or workflow.
- After publishing, the task runs automatically based on your scheduling configuration. To view and manage the task, go to Operation Center. For details, see Get started with Operation Center.
What's next
To set up load balancing and high availability for SSH nodes, see Implement load balancing and high availability for SSH nodes.