Run a script on multiple nodes in an existing cluster to install software, configure services, or download data from Object Storage Service (OSS). This feature is designed for long-running clusters. For temporary clusters, use bootstrap actions instead. For more information, see Use bootstrap actions to run scripts.
Common use cases:
- Installing packages with Yellowdog Updater, Modified (YUM)
- Downloading software from the Internet
- Reading data from OSS
- Installing and running services that require complex scripts, such as the Pig component
Prerequisites
Before you begin, make sure that you have:
- An E-MapReduce cluster in the Running state. Scripts cannot run if the cluster is in any other state. To create a cluster, see Create a cluster.
- A script uploaded to OSS. For an example, see Example.
Limitations
| Limitation | Details |
|---|---|
| Retention period | Script execution records are retained for up to 60 days. |
| Concurrency | Only one script can run in a cluster at a time. If a script is running, you cannot submit a new one. |
| Partial failure | A script may succeed on some nodes and fail on others. After you resolve the issue, rerun the script on only the failed nodes. After a scale-out, run the script on only the new nodes. |
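Because a script may be rerun on nodes where it failed earlier, it can help to write each step so that repeating it is harmless. The following is a minimal sketch of one such pattern, not a feature of E-MapReduce: the helper name `run_once` and the marker directory `MARKER_DIR` are hypothetical, chosen only for illustration.

```shell
#!/bin/bash
# Sketch: skip a step that already completed on this node, so a rerun is harmless.
# MARKER_DIR and run_once are hypothetical names, not part of E-MapReduce.
MARKER_DIR="${MARKER_DIR:-/tmp/emr-script-markers}"

run_once() {
  local step="$1"
  shift
  local marker="${MARKER_DIR}/${step}.done"
  mkdir -p "${MARKER_DIR}" || return 1
  if [ -e "${marker}" ]; then
    # A previous run already finished this step on this node.
    echo "skip ${step}"
    return 0
  fi
  # Run the step; write the marker only if the command succeeded.
  "$@" && touch "${marker}"
}

# Example usage (commented out so that sourcing this file has no side effects):
# run_once unpack-data tar -zxvf /tmp/data.tar.gz -C /opt/data
```

A failed step leaves no marker behind, so the next run on that node retries it, while steps that already succeeded are skipped.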
Run a script manually
- Log on to the EMR console. In the left-side navigation pane, click EMR on ECS.
- In the top navigation bar, select a region and a resource group.
- Find the target cluster and click Services in the Actions column.
- Click the Script Operation tab, then click the Manual Execution tab.
- Click Create And Execute.
- In the dialog box, configure the following parameters, then click OK.

  | Parameter | Description |
  |---|---|
  | Name | A name for the script. |
  | Script Location | The OSS path to the script. The path must follow the format oss://**/*.sh. |
  | Execution Scope | The nodes on which to run the script. Test on a single node first. After the test succeeds, run the script on the entire cluster. |

  Note:
  - When you use the cluster script feature, test the script on a single node first. After the test is successful, run the script on the entire cluster.
  - The script path must be in the oss://**/*.sh format.
- After you click OK, the script is added to the script list with the status Running.
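The Script Location format can be checked locally before you paste the path into the dialog box. The sketch below reads oss://**/*.sh as "an oss:// prefix, a bucket name, an object path, and a .sh suffix"; that reading is an assumption, and `is_valid_script_path` is a hypothetical helper, not a console or CLI feature.

```shell
#!/bin/bash
# Sketch: check that a path looks like oss://<bucket>/<object>.sh.
# is_valid_script_path is a hypothetical helper; the console performs its own validation.
is_valid_script_path() {
  case "$1" in
    # At least one character for the bucket, a slash, at least one character before .sh
    oss://?*/?*.sh) return 0 ;;
    *) return 1 ;;
  esac
}

# Example usage:
# is_valid_script_path "oss://my-bucket/scripts/install.sh" && echo "path format looks valid"
```

The check covers only the shape of the string. It does not confirm that the bucket or the object exists.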
Check the execution result
After the script is created, monitor and manage it from the script list.
| Action | How |
|---|---|
| View script details | Click Details in the Actions column. |
| View per-node status | Click View Execution Result. |
| Delete the script | Click Delete in the Actions column. |
Overall script statuses: Running, Complete, Submit Failed
Per-node statuses: Waiting, Running, Complete, Failed, Submit Failed, Canceled
Example
The following examples show common script patterns. By default, scripts run as the root user. To switch to the hadoop user, add su hadoop to your script.
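One way to run a single command as another user without opening an interactive shell is `su <user> -c '<command>'`. The sketch below wraps that call in a hypothetical helper named `run_as`, which skips `su` when the script already runs as the target user. The helper name and the sample command are illustrative assumptions.

```shell
#!/bin/bash
# Sketch: run one command string as a given user.
# run_as is a hypothetical helper, not part of E-MapReduce.
run_as() {
  local user="$1"
  local cmd="$2"
  if [ "$(id -un)" = "${user}" ]; then
    # Already the target user: run the command directly.
    sh -c "${cmd}"
  else
    # Switching users without a password works only when the script runs as root.
    su "${user}" -c "${cmd}"
  fi
}

# Example usage (assumes a hadoop user exists on the node):
# run_as hadoop 'whoami'
```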
Download a file from OSS and decompress it
#!/bin/bash
osscmd --id=<yourAccessKeyId> --key=<yourAccessKeySecret> --host=oss-cn-hangzhou-internal.aliyuncs.com get oss://<yourBucketName>/<yourFile>.tar.gz ./<yourFile>.tar.gz
mkdir -p /<yourDir>
tar -zxvf <yourFile>.tar.gz -C /<yourDir>
Replace the following placeholders with actual values:
| Placeholder | Description |
|---|---|
| <yourAccessKeyId> | Your Alibaba Cloud AccessKey ID |
| <yourAccessKeySecret> | Your Alibaba Cloud AccessKey Secret |
| <yourBucketName> | The OSS bucket name |
| <yourFile> | The file name (without the .tar.gz extension) |
| <yourDir> | The local directory to decompress the file into |
For the --host parameter, use the OSS endpoint that matches your network type:
- Internal endpoint (classic network): oss-cn-hangzhou-internal.aliyuncs.com
- VPC endpoint: vpc100-oss-cn-hangzhou.aliyuncs.com
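If the same script has to run in both network types, the endpoint choice can be isolated in one place. The sketch below uses only the two China (Hangzhou) endpoints listed above. The function name `oss_host_for` and the labels `classic` and `vpc` are hypothetical, and endpoints for other regions are not covered.

```shell
#!/bin/bash
# Sketch: map a network type to the matching OSS endpoint for China (Hangzhou).
# oss_host_for and the classic/vpc labels are hypothetical names for illustration.
oss_host_for() {
  case "$1" in
    classic) echo "oss-cn-hangzhou-internal.aliyuncs.com" ;;
    vpc)     echo "vpc100-oss-cn-hangzhou.aliyuncs.com" ;;
    *)
      echo "unknown network type: $1" >&2
      return 1
      ;;
  esac
}

# Example usage: pass the result to the --host parameter of the download command.
# host="$(oss_host_for vpc)" || exit 1
```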
Install a system package with YUM
#!/bin/bash
yum install -y ld-linux.so.2