After you create a cluster, you can use the manual script execution feature to execute a script on multiple nodes in the cluster at the same time. This topic describes how to add and manually execute a script.
Background information
The manual script execution feature allows you to immediately execute a specific script on multiple nodes in a cluster at the same time. This feature is suitable for long-running clusters. For temporary clusters, we recommend that you use bootstrap actions to initialize the clusters. For information about bootstrap actions, see Use bootstrap actions to execute scripts.
Manually executed scripts are similar to bootstrap action scripts. You can use them to install software and services that are not pre-installed in EMR clusters. Examples:
Use YUM to install software whose installation package is available.
Download public software from the Internet.
Read your data from Object Storage Service (OSS).
Install and run a service, such as Pig. In this case, the script that you must write is complex.
Prerequisites
An E-MapReduce (EMR) cluster is created. For more information, see Create a cluster.
The cluster is in the Running state. Scripts cannot be executed in clusters that are in other states.
Cluster scripts are developed or obtained and uploaded to OSS. For more information about cluster scripts, see Example.
Usage notes
The records of manually executed scripts are retained for up to 60 days.
Only one cluster script can be executed in a cluster at a specific point in time. You cannot submit another cluster script if one is already in progress.
A cluster script may succeed on some nodes but fail on others. For example, the script may fail on a node that is restarted during execution. After you resolve the issue, you can re-execute the cluster script on the failed nodes. After you scale out a cluster, you can also execute cluster scripts on the added nodes.
Procedure
Go to the Script Operation tab.
Log on to the EMR console. In the left-side navigation pane, click EMR on ECS.
In the top navigation bar, select a region and a resource group based on your business requirements.
Find the cluster that you want to manage and click Services in the Actions column.
Click the Script Operation tab.
On the Script Operation tab, click the Manual Execution tab.
Click Create and Execute.
In the Add Manual Execution Script dialog box, configure the Name parameter, select the path where the script is located from the Script Address drop-down list, select the nodes on which you want to execute the script for the Execution Node parameter, and enter custom parameters in the Parameter field.
Note: We recommend that you test the script on a single node before you execute it on the entire cluster.
The script path must be in the oss://**/*.sh format.
Click OK.
After the script is created, the script is displayed in the cluster script list and is in the Running state. A script can be in the Running, Complete, or Submit Failed state.
To view the details of the script, click Details in the Actions column.
To view the script execution result, click View Execution Result in the Actions column.
The nodes on which you execute the script can be in the Waiting, Running, Complete, Failed, Submit Failed, or Cancel state.
To delete the script, click Delete in the Actions column.
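The procedure above requires the script path to be in the oss://**/*.sh format. As a minimal sketch, a path can be checked locally before you select it in the console. The function name is_valid_script_path is hypothetical, and the regular expression reflects one reading of that format (a bucket name, an optional directory path, and a file name that ends in .sh), not an official validation rule.

```shell
#!/bin/bash
# is_valid_script_path PATH
# Returns 0 if PATH looks like oss://<bucket>/<optional dirs>/<name>.sh.
# Hypothetical helper: the pattern is an assumed interpretation of the
# oss://**/*.sh format, not the check that the EMR console performs.
is_valid_script_path() {
  local re='^oss://[^/]+/.*[^/]\.sh$'
  [[ "$1" =~ $re ]]
}

# Example call (uncomment to try):
# is_valid_script_path "oss://my-bucket/scripts/install.sh" && echo "path looks valid"
```

A path such as oss://my-bucket/scripts/install.sh passes this check, whereas a local path such as /tmp/install.sh or an object that does not end in .sh does not.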
Example
Similar to bootstrap action scripts, a manually executed script can download objects from OSS. For example, the following script downloads the oss://<yourBucketName>/<yourFile>.tar.gz object to the node on which the script runs and decompresses it to the /<yourDir> directory.
#!/bin/bash
# Download the archive from OSS to the current directory on the node.
osscmd --id=<yourAccessKeyId> --key=<yourAccessKeySecret> --host=oss-cn-hangzhou-internal.aliyuncs.com get oss://<yourBucketName>/<yourFile>.tar.gz ./<yourFile>.tar.gz
# Create the target directory and extract the archive into it.
mkdir -p /<yourDir>
tar -zxvf <yourFile>.tar.gz -C /<yourDir>
The specified OSS address can be an internal, public, or virtual private cloud (VPC) endpoint. If the classic network is used, you must specify an internal endpoint. For example, the internal endpoint of OSS in the China (Hangzhou) region is oss-cn-hangzhou-internal.aliyuncs.com. If a VPC is used, you must specify a domain name that you can access from the VPC. For example, the domain name of OSS in the China (Hangzhou) region is vpc100-oss-cn-hangzhou.aliyuncs.com.
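The endpoint choice described above can be sketched as a small helper that builds the host name from a region ID and a network type. The function name oss_endpoint is hypothetical. The internal and VPC patterns are generalized from the China (Hangzhou) examples in this topic, and the public pattern does not appear in this topic, so treat all three as assumptions and confirm the endpoint for your region before you use it.

```shell
#!/bin/bash
# oss_endpoint REGION NETWORK
# Prints an OSS endpoint for REGION (for example, cn-hangzhou) and
# NETWORK (internal | vpc | public).
# Assumption: the naming patterns are generalized from the China (Hangzhou)
# examples; verify the actual endpoint for other regions.
oss_endpoint() {
  local region="$1" network="$2"
  case "$network" in
    internal) echo "oss-${region}-internal.aliyuncs.com" ;;
    vpc)      echo "vpc100-oss-${region}.aliyuncs.com" ;;
    public)   echo "oss-${region}.aliyuncs.com" ;;
    *)        echo "unknown network type: ${network}" >&2; return 1 ;;
  esac
}

# Example call (uncomment to try):
# oss_endpoint cn-hangzhou vpc
```

For example, oss_endpoint cn-hangzhou internal prints oss-cn-hangzhou-internal.aliyuncs.com, which can then be passed to the --host option of osscmd.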
You can also use YUM to install additional system software packages, such as ld-linux.so.2.
#!/bin/bash
yum install -y ld-linux.so.2
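Because a cluster script can be re-executed on failed nodes or on newly added nodes, it can help to skip an installation that is already in place. The following is a minimal sketch for command-line tools. The function name ensure_installed is hypothetical, and the check relies only on whether the command is found on the PATH, which is an assumption that does not apply to libraries such as ld-linux.so.2.

```shell
#!/bin/bash
# ensure_installed COMMAND [PACKAGE]
# Installs PACKAGE (defaults to COMMAND) with YUM only if COMMAND is not
# already available on the PATH. Hypothetical helper for illustration.
ensure_installed() {
  local cmd="$1" pkg="${2:-$1}"
  if command -v "$cmd" >/dev/null 2>&1; then
    echo "$cmd already present"
    return 0
  fi
  yum install -y "$pkg"
}

# Example call (uncomment to try):
# ensure_installed unzip
```

If the command already exists, the function prints a short message and returns without calling YUM, so repeated executions of the script leave the node unchanged.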
By default, the root account is used to execute specified scripts on nodes in a cluster. You can run the su hadoop command in the script to switch to the hadoop user.
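One way to apply the user switch to an individual command is the -c option of su, which runs a single command string as the target user and then returns to the calling script. The following is a minimal sketch. The function name run_as is hypothetical, and the branch that skips su when the script already runs as the target user is an addition for convenience, not part of the EMR behavior described above.

```shell
#!/bin/bash
# run_as USER 'COMMAND STRING'
# Runs the command string as USER. When the script is executed as root
# (the default for cluster scripts), su does not prompt for a password.
# Hypothetical helper for illustration.
run_as() {
  local user="$1" cmd="$2"
  if [ "$(id -un)" = "$user" ]; then
    # Already running as the target user, so no switch is needed.
    bash -c "$cmd"
  else
    su "$user" -c "$cmd"
  fi
}

# Example call (uncomment to try on a cluster node):
# run_as hadoop 'whoami'
```

Commands that are not wrapped in run_as continue to run as the account that executes the script.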