Limits
- Shell nodes support the standard shell syntax but not the interactive syntax.
- Shell nodes can be run only by using exclusive resource groups for scheduling.
- A Shell node that is run on an exclusive resource group for scheduling may need to access a data source for which a whitelist is configured. In this case, you must add the required elastic IP address (EIP) or CIDR block to the whitelist of the data source.
- Do not start a large number of subprocesses in a Shell node. DataWorks does not limit the resource usage of Shell nodes, so a node that starts many subprocesses on an exclusive resource group for scheduling may affect other nodes that run on the same resource group.
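One way to keep the number of subprocesses small is to run background jobs in fixed-size batches. The following is a minimal sketch, not a DataWorks feature: MAX_JOBS, run_bounded, and process_item are hypothetical names, and the cap of 4 is an arbitrary assumption that you would tune for your own resource group.

```shell
#!/bin/bash
# Hypothetical cap on concurrent background jobs (assumption, not a DataWorks setting).
MAX_JOBS=4

# Placeholder for the real work done per item.
process_item() {
    echo "processing $1"
}

# Start one background job per argument, but pause after every MAX_JOBS jobs
# until the whole batch has finished.
run_bounded() {
    local count=0 item
    for item in "$@"; do
        process_item "$item" &
        count=$((count + 1))
        if (( count % MAX_JOBS == 0 )); then
            wait    # block until all jobs of the current batch exit
        fi
    done
    wait            # collect the jobs of the last, possibly partial, batch
}

run_bounded a b c d e f
```

Because the jobs in a batch run concurrently, the order of the output lines is not guaranteed.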
Create a common Shell node
Go to the DataStudio page.
Log on to the DataWorks console.
In the left-side navigation pane, click Workspaces.
In the top navigation bar, select the region where the workspace resides. On the Workspaces page, find the workspace in which you want to create the Shell node, and click in the Actions column.
- Move the pointer over the icon and choose . Alternatively, you can click the name of the desired workflow in the Business Flow section, right-click General, and then choose.
Enable a Shell node to use resources
Before a node can use a resource in DataWorks, you must upload the resource to DataWorks and reference the resource in the runtime environment of the node. This section describes the procedure.
Upload a resource
DataWorks allows you to create a resource or upload an existing resource. Which method is available depends on the resource type. Follow the options shown on the configuration page of each resource type.
- Go to the DataStudio page and create the desired type of resource for the Shell node in the desired workflow based on your business requirements.
  Note: If no workflow is available, create one. For information about how to create a workflow, see Create a workflow.
- Commit and deploy the resource. Click the icon in the top toolbar to commit the resource to the development environment.
  Note: If nodes in the production environment need to use this resource, you must also deploy the resource to the production environment. For more information, see Deploy nodes.
Reference the resource in the node
To enable the node to use the resource, you must reference the resource in the node. After the resource is referenced, the @resource_reference{"Resource name"} comment is displayed in the upper part of the node code.
Procedure:
- Open the created node.
- In the Scheduled Workflow pane of the DataStudio page, find the resource that you uploaded.
- Right-click the resource and select Insert Resource Path to reference the resource in the current node.
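After the resource is referenced, the node code can call it. The following is a minimal sketch under stated assumptions: hello.sh is a hypothetical resource name, run_resource is a hypothetical helper, the leading ## in front of the reference comment is assumed so that the line stays a shell comment, and the resource is assumed to be available under its own name in the working directory of the node.

```shell
#!/bin/bash
##@resource_reference{"hello.sh"}
# "hello.sh" is a hypothetical resource name. The line above mirrors the
# comment that appears after you choose Insert Resource Path.

# run_resource is a hypothetical helper. It assumes that the referenced
# resource is a shell script located in the current working directory.
run_resource() {
    local resource="$1"
    shift
    if [[ ! -f "$resource" ]]; then
        echo "resource not found: $resource" >&2
        return 1
    fi
    # Pass any remaining arguments on to the resource script.
    bash "$resource" "$@"
}

# In a node you might call: run_resource hello.sh "$1" || exit 1
```

Checking for the file first turns a missing or unreferenced resource into a clear error message and a non-zero exit code instead of a less obvious failure later in the script.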
Scheduling parameters used by Shell nodes
You are not allowed to customize variable names for common Shell nodes. The variables must be named based on their ordinal numbers, such as $1, $2, and $3. If the number of parameters reaches or exceeds 10, use ${Number} to declare the excess variables. For example, use ${10} to declare the tenth variable. For information about how to configure and use scheduling parameters, see Configure and use scheduling parameters. For information about the methods to assign values to scheduling parameters, see Supported formats of scheduling parameters.
In the preceding figure, custom parameters are assigned to the custom variables $1, $2, and $3 in the Parameters section, and the custom variables are referenced in the code editor. Examples:
- $1: Specify $bizdate as $1. This variable is used to obtain the data timestamp. $bizdate is a built-in parameter.
- $2: Specify ${yyyymmdd} as $2. This variable is used to obtain the data timestamp.
- $3: Specify $[yyyymmdd] as $3. This variable is used to obtain the data timestamp.
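The difference between $1 and ${10} can be seen in a small sketch. It simulates the values from the Parameters section by passing arguments to a function. show_params and the sample values are hypothetical and serve only as an illustration.

```shell
#!/bin/bash
# show_params is a hypothetical helper that prints selected positional parameters.
show_params() {
    echo "first=$1"
    echo "second=$2"
    echo "tenth=${10}"     # braces are required from the tenth parameter onward
    echo "unbraced=$10"    # parsed as $1 followed by a literal 0
}

# Simulate ten parameter values; in a node they come from the Parameters section.
show_params 20240101 b c d e f g h i 20240102
# first=20240101
# second=b
# tenth=20240102
# unbraced=202401010
```

The last line of output shows why the braces matter: without them, the shell expands $1 and appends the character 0.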
How do I determine whether a custom Shell script is successfully run?
The exit code of the custom Shell script determines whether the script is successfully run. Exit codes:
- 0: indicates that the custom Shell script is successfully run.
- -1: indicates that the custom Shell script is terminated.
- 2: indicates that the custom Shell script needs to be automatically rerun.
- Other exit codes: indicate that the custom Shell script fails to run.
#! /bin/bash
curl http://xxxxx/asdasd
echo "nihao"
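In this example, the Shell script is considered successfully run even if the curl command fails. The exit code of a script is the exit code of its last command, and the final echo command exits with 0.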
#! /bin/bash
curl http://xxxxx/asdasd
if [[ $? == 0 ]];then
    echo "curl success"
else
    echo "failed"
    exit 1
fi
echo "nihao"
In this case, if the curl command fails, the script exits with exit code 1 and fails to run.
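The check on $? can be wrapped in a small function so that every important command is handled the same way. The following is a minimal sketch. run_step is a hypothetical helper, not part of DataWorks, and it maps any failure to exit code 1.

```shell
#!/bin/bash
# run_step is a hypothetical helper: it runs the given command and converts
# any non-zero exit code into 1, so the caller can stop the script.
run_step() {
    "$@"
    local status=$?
    if [[ $status -ne 0 ]]; then
        echo "step failed (exit code $status): $*" >&2
        return 1
    fi
    echo "step succeeded: $*"
}

# In a node you might call:
#   run_step curl --fail http://xxxxx/asdasd || exit 1
# The --fail option makes curl return a non-zero exit code for HTTP error
# responses, which it would otherwise report as success.
```

With this pattern, a later command such as echo cannot hide an earlier failure, because the script stops at the first failed step.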
Use a Shell script to access OSSUtils
- Default path of OSSUtils in DataWorks: /home/admin/usertools/tools/ossutil64.
- For information about the common commands in OSSUtils, see Common commands.
Sample configuration file, which is passed to OSSUtils by using the --config-file option:
[Credentials]
language = CH
endpoint = oss.aliyuncs.com
accessKeyID = your_accesskey_id
accessKeySecret = your_accesskey_secret
stsToken = your_sts_token
outputDir = your_output_dir
ramRoleArn = your_ram_role_arn
Command syntax:
#! /bin/bash
/home/admin/usertools/tools/ossutil64 --config-file /home/admin/usertools/tools/myconfig cp oss://bucket/object object
if [[ $? == 0 ]];then
    echo "access oss success"
else
    echo "failed"
    exit 1
fi
echo "finished"
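Before the copy command runs, it can help to confirm that the OSSUtils binary and the configuration file are in place. The following is a minimal sketch. check_ossutil is a hypothetical helper, and the two paths are taken from the command syntax in this section and may differ in your environment.

```shell
#!/bin/bash
# Paths as used in the command syntax of this section (adjust if yours differ).
OSSUTIL=/home/admin/usertools/tools/ossutil64
OSS_CONFIG=/home/admin/usertools/tools/myconfig

# check_ossutil is a hypothetical helper that verifies the prerequisites
# and returns 1 with a message on stderr if one of them is missing.
check_ossutil() {
    local bin="$1" config="$2"
    if [[ ! -x "$bin" ]]; then
        echo "ossutil not found or not executable: $bin" >&2
        return 1
    fi
    if [[ ! -r "$config" ]]; then
        echo "configuration file not readable: $config" >&2
        return 1
    fi
    echo "ossutil prerequisites ok"
}

# In a node you might call:
#   check_ossutil "$OSSUTIL" "$OSS_CONFIG" || exit 1
```

A failed check then produces a specific message in the node log instead of a generic "failed" line from the copy step.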
Subsequent operations
If the Shell node needs to be periodically scheduled, you need to define the scheduling properties for the Shell node and deploy the node to the production environment. For information about how to configure scheduling properties for nodes, see Configure scheduling properties for nodes. For information about how to deploy nodes to the production environment, see Deploy nodes.