A bootstrap action runs a custom script before the cluster starts Hadoop. Use the script to install third-party software that you need or to change the cluster's operating environment.
Functions of bootstrap actions
With bootstrap actions, you can install software and services on your cluster that clusters do not currently provide. For example:
- Install available software with Yum.
- Download open-source software directly from a public network.
- Read your data from OSS.
- Install and run a service such as Flink or Impala, although the script you need to write is more complex.
We strongly recommend that you test the bootstrap action with a Pay-As-You-Go cluster and create a subscription cluster only after the test is successful.
How to use
- Log on to the Alibaba Cloud E-MapReduce console.
- Select a region. The clusters created in that region are listed.
- Click Create Cluster to enter the cluster creation page.
- At the end of the basic configuration page, click Add to enter the operation page.
- Enter the configuration items.
You can add up to 16 bootstrap actions. They run during cluster initialization in the order you specify. By default, each script runs under the root account. To switch to the hadoop account, use su hadoop in the script.
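As a minimal sketch of the account switch described above, the helper below runs a single command under another account with su. The function name run_as and the example command are hypothetical. It assumes the script starts as root (the default stated above), so su does not prompt for a password.

```shell
#!/bin/bash
# run_as USER COMMAND: run COMMAND under the USER account.
# Hypothetical helper for illustration; not part of E-MapReduce.
run_as() {
    local user="$1" cmd="$2"
    if [ "$(id -un)" = "$user" ]; then
        bash -c "$cmd"          # already the target account, no switch needed
    else
        su "$user" -c "$cmd"    # root can switch accounts without a password
    fi
}

# Example call inside a bootstrap script (the directory is an assumption):
# run_as hadoop 'mkdir -p "$HOME/tools"'
```

Keeping the switch inside one helper leaves the rest of the script running as root, so only the steps that need the hadoop account use it.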
A bootstrap action can fail. For ease of use, a failed bootstrap action does not prevent the cluster from being created. After the cluster is created, check the Bootstrap/software configuration column of the cluster information on the cluster details page for any abnormality. If there is one, log on to each node and view the operation logs in the /var/log/bootstrap-actions directory.
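When checking a node by hand, a small helper can print the end of every log file in that directory. This is only a sketch: the function name show_bootstrap_logs and the default of 20 lines are assumptions, and because the names and format of the log files are not described here, it simply tails each regular file it finds.

```shell
#!/bin/bash
# show_bootstrap_logs [DIR] [LINES]: print the last LINES lines of every
# file in DIR. DIR defaults to /var/log/bootstrap-actions.
# Hypothetical helper for illustration; not part of E-MapReduce.
show_bootstrap_logs() {
    local dir="${1:-/var/log/bootstrap-actions}"
    local lines="${2:-20}"
    if [ ! -d "$dir" ]; then
        echo "no log directory: $dir" >&2
        return 1
    fi
    local f
    for f in "$dir"/*; do
        [ -f "$f" ] || continue     # skip subdirectories and an unmatched glob
        echo "==> $f <=="
        tail -n "$lines" "$f"
    done
}
```

Run show_bootstrap_logs with no arguments on each node you log on to, or pass a different directory and line count as the first and second arguments.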
Bootstrap action types
Bootstrap actions fall into two categories: customized bootstrap actions and operating-condition bootstrap actions. The main difference is that an operating-condition bootstrap action runs the operation you specify only on nodes that meet the condition.
Customized bootstrap action
For a customized bootstrap action, you must specify the bootstrap action name and the location of the script in OSS, and set optional parameters as required. During cluster initialization, every node downloads the specified script from OSS and runs it, either directly or with the optional parameters appended.
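For example, the following script downloads an archive from OSS and extracts it into a directory:
    #!/bin/bash
    # Download the archive from OSS to the current directory.
    osscmd --id=<yourid> --key=<yourkey> --host=oss-cn-hangzhou-internal.aliyuncs.com get oss://<yourbucket>/<myfile>.tar.gz ./<myfile>.tar.gz
    # Create the target directory if it does not exist.
    mkdir -p /<yourdir>
    # Extract the archive into the target directory.
    tar -zxvf <myfile>.tar.gz -C /<yourdir>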
The osscmd tool is preinstalled on every node, so the script can call it directly to download files.
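Because a failed bootstrap action does not stop cluster creation, it can help for the script to report problems clearly in its own output. The sketch below wraps the extraction step so that a missing archive produces an explicit error and a non-zero status. The function name extract_to is hypothetical, and the archive path is whatever the preceding download step produced.

```shell
#!/bin/bash
# extract_to ARCHIVE DEST: unpack a .tar.gz archive into DEST.
# Prints an error and returns 1 if ARCHIVE does not exist.
# Hypothetical helper for illustration; not part of E-MapReduce.
extract_to() {
    local archive="$1" dest="$2"
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi
    # Create the target directory, then extract only if that succeeded.
    mkdir -p "$dest" && tar -zxf "$archive" -C "$dest"
}
```

A bootstrap script could call extract_to on the downloaded file and exit with the returned status, so that a failed download shows up as an abnormality instead of passing silently.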
The following script installs a package with Yum:
    #!/bin/bash
    yum install -y ld-linux.so.2
Operating-condition bootstrap action
instance.isMaster=true mkdir -p /tmp/abc
To specify multiple commands, separate them with semicolons (;). For example:
instance.isMaster=true mkdir -p /tmp/abc;mkdir -p /tmp/def
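To make the structure of such a parameter string easier to see, the sketch below splits it into its condition and its individual commands. It is an illustration only: the function name split_run_if is hypothetical, and it makes no claim about how E-MapReduce itself parses the string. It assumes the condition is everything before the first space and that commands are separated by semicolons, as in the examples above.

```shell
#!/bin/bash
# split_run_if PARAM: print the condition, then each command, one per line.
# Hypothetical helper for illustration; not part of E-MapReduce.
split_run_if() {
    local param="$1"
    local condition="${param%% *}"   # text before the first space
    local commands="${param#* }"     # text after the first space
    echo "condition: $condition"
    local IFS=';'                    # split the command list on semicolons only
    local cmd
    for cmd in $commands; do
        echo "command: $cmd"
    done
}

split_run_if 'instance.isMaster=true mkdir -p /tmp/abc;mkdir -p /tmp/def'
```

For the string above, the helper prints one condition line followed by one line for each of the two mkdir commands.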