Bootstrap actions let you automatically run custom shell scripts on cluster nodes during cluster creation, scale-out, or auto scaling. Use them to install third-party software, modify runtime configurations, or set up services that EMR does not support natively.
For information about running scripts on existing nodes, see Manually execute scripts.
How it works
When a cluster node starts, EMR downloads your script from an Object Storage Service (OSS) bucket and runs it at the stage you specify:
Before component installation — runs before any services are installed
Before component startup — runs after services are installed but before they start
After component startup — runs after services are running
Bootstrap actions run in the order you define them. If a script fails, EMR either stops the process (blocking cluster creation or scale-out) or continues to the next script, depending on the failure policy you set.
Common use cases include:
Installing software using YUM (when the installation package is available)
Downloading public software from the Internet
Reading data from OSS
Installing and running services such as Flink or Impala
Limitations
Each cluster supports a maximum of 10 bootstrap actions.
Scripts run as `root` by default. To switch to the Hadoop user, run `su - hadoop` inside the script.
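As a minimal sketch, a script can record which account it runs under and switch to the Hadoop user for a single command (the `/tmp/bootstrap-user.log` path is illustrative only):

```shell
#!/bin/bash
# Bootstrap scripts run as root by default; record the invoking user.
echo "Bootstrap running as: $(whoami)" >> /tmp/bootstrap-user.log

# Run one command as the hadoop user, but only if that account exists
# (EMR creates it on cluster nodes; it may be absent elsewhere).
if id hadoop &>/dev/null; then
    su - hadoop -c 'echo "Now running as: $(whoami)" >> /tmp/bootstrap-user.log'
fi
```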
Prerequisites
Before you begin, ensure that you have:
An OSS bucket in the same region as your cluster, with your script file uploaded
The script file stored at a path in the `oss://<bucket>/<path>.sh` format
Add a bootstrap action
During cluster creation
Log on to the EMR console. In the left-side navigation pane, click EMR on ECS.
In the top navigation bar, select a region and a resource group.
Click Create Cluster.
In the Basic Configuration step, scroll to the Advanced Settings section, find Bootstrap Actions, and click Add Bootstrap Action.
Configure the following parameters.
| Parameter | Description |
|---|---|
| Action name | A display name for this bootstrap action. |
| Script path | The OSS path to your script file. Format: `oss://<bucket>/<path>.sh`. |
| Parameter | Optional arguments passed to the script. Use this to set variable values referenced inside the script. |
| Execution time | When the script runs: Before component installation, Before component startup, or After component startup. |
| Execution failure policy | Proceed: if the script fails, continue with the next script. Stop: if the script fails, abort cluster creation or scale-out. |
| Execution scope | Cluster: run on all nodes. Node group type: run only on nodes of the specified group types. |

Click OK.
After the cluster is created, check the Script Operation tab to confirm the script ran without errors. A bootstrap action can fail without blocking cluster creation, so always verify the result. If an error occurred, see View execution logs for troubleshooting.
After cluster creation
Log on to the EMR console. In the left-side navigation pane, click EMR on ECS.
In the top navigation bar, select a region and a resource group.
Find the target cluster and click Services in the Actions column.
Click the Script Operation tab, then click the Bootstrap Actions tab.
Click Add Bootstrap Action.
In the Add Bootstrap Action dialog box, configure the following parameters.
| Parameter | Description |
|---|---|
| Name | A display name for this bootstrap action. |
| Script address | The OSS path to your script file. Format: `oss://<bucket>/<path>.sh`. |
| Parameter | Optional arguments passed to the script. Use this to set variable values referenced inside the script. |
| Execution scope | Cluster: run on all nodes. Node group type: run only on nodes of the specified group types. Node group: run only on the specific node groups you select. |
| Execution time | When the script runs: Before component installation, Before component startup, or After component startup. |
| Execution failure policy | Proceed: if the script fails, continue with the next script. Stop: if the script fails, abort cluster creation or scale-out. |
Click OK.
Manage existing bootstrap actions
On the Bootstrap Actions tab, you can edit, clone, or delete a bootstrap action from the Actions column.
View execution logs
Add logging statements at key points in your script so you can trace script behavior from the operational log.
Log on to the EMR console. In the left-side navigation pane, click EMR on ECS.
In the top navigation bar, select a region and a resource group.
Find the target cluster and click Services in the Actions column.
On the Script Operation tab, find the script and click View Execution Result in the Actions column.
In the Operation History panel, locate the relevant operation record and click the expand icon to view activity details.
DataLake, Dataflow, OLAP, DataServing, and custom clusters: look for `createNodeGroup` or `increaseNodeGroup` records. Tasks prefixed with `RUN_BOOTSTRAP_CLUSTER_SCRIPT_<action-name>_<action-ID>` are bootstrap action activities.
Hadoop, Data Science, and EMR Studio clusters: look for `CREATE_CLUSTER` or `RESIZE_CLUSTER` records. Under `pollDeployTaskStatusActivity`, tasks named `RUN_SCRIPT_HOST_**` are bootstrap action activities.
View the Stdout and Stderr logs.
Examples
Each node downloads the script from the specified OSS path and runs it directly or with the parameters you define. The following examples cover common use cases.
Example 1: Download a file from OSS and decompress it
You can specify the file to download from OSS directly in the script. The following scripts download the <myFile>.tar.gz file from OSS and decompress it to the /<yourDir> directory. Choose the appropriate script based on your cluster type.
The OSS endpoint in your script must be accessible from your cluster's network. Use an internal endpoint for classic networks (for example, oss-cn-hangzhou-internal.aliyuncs.com for the China (Hangzhou) region) and a virtual private cloud (VPC) endpoint for VPC networks (for example, vpc100-oss-cn-hangzhou.aliyuncs.com).
DataLake, Dataflow, OLAP, DataServing, and custom clusters (uses ossutil64):
#!/bin/bash
ossutil64 cp oss://<yourBucket>/<myFile>.tar.gz ./ \
-e oss-cn-hangzhou-internal.aliyuncs.com \
-i <yourAccessKeyId> \
-k <yourAccessKeySecret>
mkdir -p /<yourDir>
tar -zxvf <myFile>.tar.gz -C /<yourDir>
Hadoop clusters (uses osscmd):
#!/bin/bash
osscmd --id=<yourAccessKeyId> \
--key=<yourAccessKeySecret> \
--host=oss-cn-hangzhou-internal.aliyuncs.com \
get oss://<yourBucket>/<myFile>.tar.gz ./
mkdir -p /<yourDir>
tar -zxvf <myFile>.tar.gz -C /<yourDir>
Replace the following placeholders:
| Placeholder | Description | Example |
|---|---|---|
| <yourBucket> | OSS bucket name | my-emr-bucket |
| <myFile> | File name in the bucket | my-package |
| <yourDir> | Local directory to decompress into | opt/myapp |
| <yourAccessKeyId> | AccessKey ID | LTAI5tXxx |
| <yourAccessKeySecret> | AccessKey secret | xXxXxXx |
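The Parameter field of a bootstrap action passes arguments to the script at run time. As a hedged sketch (the default values and endpoint below are illustrative only), the ossutil64 variant of Example 1 could read the bucket, file, and target directory from its arguments instead of hard-coding them:

```shell
#!/bin/bash
# Positional arguments come from the bootstrap action's Parameter field.
# The defaults below are illustrative placeholders.
BUCKET="${1:-my-emr-bucket}"
FILE="${2:-my-package}"
DIR="${3:-/tmp/myapp}"
ENDPOINT="${4:-oss-cn-hangzhou-internal.aliyuncs.com}"

echo "Fetching oss://${BUCKET}/${FILE}.tar.gz into ${DIR}"
mkdir -p "${DIR}"

# Download and unpack only where ossutil64 is available (as on EMR nodes).
if command -v ossutil64 &>/dev/null; then
    ossutil64 cp "oss://${BUCKET}/${FILE}.tar.gz" ./ -e "${ENDPOINT}"
    tar -zxvf "${FILE}.tar.gz" -C "${DIR}"
fi
```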
Example 2: Install system software with YUM
#!/bin/bash
yum install -y ld-linux.so.2
Troubleshooting
Script interrupted with no error in the logs
This usually means the script exited unexpectedly before writing to the log. The most common causes:
Network issue: The ECS instances and the OSS bucket must be in the same region. A cross-region connection attempt (for example, a China (Beijing) ECS instance accessing an OSS bucket outside China (Beijing)) silently fails.
Missing IAM role: If the cluster nodes cannot get the AccessKey pair, check that the ECS instances are assigned the `AliyunECSInstanceForEMRRole` role.
`nohup` without output redirection: If your script uses `nohup` but does not redirect output, the process may hang indefinitely. Use the form `nohup <command> > <logfile> 2>&1`.
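A minimal runnable illustration of the redirected form, using `sleep` as a stand-in for the real command (the `&` backgrounds the task so the script itself does not block; log path is illustrative):

```shell
#!/bin/bash
# Redirect both stdout and stderr so nohup does not hang waiting on an
# attached terminal, then background the command so the script continues.
nohup sleep 1 > /tmp/nohup-demo.log 2>&1 &
echo "started background task with PID $!"
wait   # a real bootstrap script would typically not wait here
```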
Add logging at key checkpoints to identify where execution stops:
#!/bin/bash
echo "Step 1: Downloading file..." >> /tmp/bootstrap.log 2>&1
ossutil64 cp oss://<yourBucket>/<myFile>.tar.gz ./ -e <endpoint> >> /tmp/bootstrap.log 2>&1
echo "Step 1: Done" >> /tmp/bootstrap.log 2>&1
^M in error logs (Windows line endings)
If the Operation History error log contains ^M, the script was saved with Windows-style CRLF line endings, which cause errors in the Linux environment. Fix the line endings before uploading to OSS:
# Option 1: Use tr
tr -d '\r' < your-script.sh > your-script-fixed.sh
# Option 2: Use perl
perl -pi -e 's/\r\n/\n/g' your-script.sh
YARN or HDFS commands not found
By default, scripts run without loading the system profile, so Hadoop-related commands are not in the PATH. Add the following line at the start of your script:
. /etc/profile
There must be a space between `.` and `/etc/profile`. Without the space, the shell interprets `./etc/profile` as a relative path and the command fails.
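Putting this together, a sketch of a script that loads the profile before calling a Hadoop command (the HDFS path and log file are illustrative, and the `hadoop` call is guarded so the sketch also runs where Hadoop is absent):

```shell
#!/bin/bash
# Load the system profile so Hadoop-related commands are on the PATH.
. /etc/profile

# List the HDFS root as a simple smoke test; fall back to a log message
# on machines where the hadoop command is not installed.
if command -v hadoop &>/dev/null; then
    hadoop fs -ls / >> /tmp/bootstrap-hdfs.log 2>&1
else
    echo "hadoop is not on the PATH" >> /tmp/bootstrap-hdfs.log
fi
```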