E-MapReduce (EMR) allows you to use bootstrap actions to install third-party software or modify the runtime environment of your clusters. This topic describes how to add, edit, clone, and remove bootstrap actions.

Background information

Bootstrap actions can help you perform many operations that are not supported by EMR clusters. For example, you can use bootstrap actions to perform the following operations:
  • Use Yellowdog Updater, Modified (YUM) to install software that is available as a package.
  • Download public software from the Internet.
  • Read your data from Object Storage Service (OSS).
  • Install and run services, such as Flink or Impala.

Limits

  • You can add a maximum of 10 bootstrap actions. Bootstrap actions run in the order that you specify.
  • By default, the script that you specify runs as the root user. To run the script as the hadoop user, add the su hadoop command to the script.
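The following is a minimal sketch of one way to run a single command as the hadoop user from a script that starts as root. It uses the standard `su <user> -c` form. The helper name `run_as_target` and the `TARGET_USER` variable are hypothetical names introduced for illustration, and the fallback branch exists only so that the sketch also runs on a machine that has no hadoop user.

```shell
#!/bin/bash
# Hypothetical helper: run one command as TARGET_USER when the script runs
# as root and that user exists; otherwise run it as the current user.
TARGET_USER="${TARGET_USER:-hadoop}"

run_as_target() {
  if [ "$(id -u)" -eq 0 ] && id "$TARGET_USER" >/dev/null 2>&1; then
    # su -c runs the given command string as the target user.
    su "$TARGET_USER" -c "$1"
  else
    bash -c "$1"
  fi
}

# Single quotes defer $(whoami) so it is evaluated by the child shell.
run_as_target 'echo "running as $(whoami)"'
```

On a cluster node where the script starts as root, the command should report the hadoop user. Elsewhere it reports the current user.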

Add a bootstrap action

You can use one of the following methods to add a bootstrap action:
  • Method 1: Add a bootstrap action when you create a cluster.
    1. Go to the Cluster Management page.
      1. Log on to the Alibaba Cloud EMR console.
      2. In the top navigation bar, select the region where your cluster resides and select a resource group based on your business requirements.
      3. Click the Cluster Management tab.
    2. Click Cluster Wizard in the upper-right corner.
    3. In the Advanced Settings section of the Basic Settings step, find Bootstrap Actions and click Add.
    4. In the Add Bootstrap Actions dialog box, configure the following parameters.
      Parameter                 Description
      Name                      The name of the bootstrap action that you want to add.
      Script Path               The OSS path where the script file is located. You must specify this parameter in the oss://**/*.sh format.
      Parameter                 The parameter of the bootstrap action script. It specifies the value of the variable that is referenced in the script.
      Target Nodes              • Cluster: The bootstrap action applies to the entire cluster.
                                • Host Group: The bootstrap action applies only to a specific machine group. You can select Core Instance Group, Master Instance Group, or an existing machine group.
      Execution At              • Before Component Startup: The system runs the script before the deployed components are started.
                                • After Component Startup: The system runs the script after the deployed components are started.
      Execution Failure Policy  • Proceed: If the script fails, the system continues to run the next script.
                                • Stop: If the script fails, the system stops running scripts.
    5. Click OK.
      For sample scripts, see the Examples section of this topic.
      Note: A bootstrap action that you add may fail to run. However, the failure does not affect the creation of the cluster.

      After the cluster is created, check the value of Bootstrap Actions/EMR Version in the Cluster Info section of the Cluster Overview page to determine whether an exception occurred. If an exception occurred, log on to each node of the cluster and view the operational logs, which are stored in the /var/log/bootstrap-actions directory.

  • Method 2: Add a bootstrap action after you create a cluster.
    1. Go to the Bootstrap Actions page.
      1. Log on to the Alibaba Cloud EMR console.
      2. In the top navigation bar, select the region where your cluster resides and select a resource group based on your business requirements.
      3. Click the Cluster Management tab.
      4. In the left-side navigation pane, click Bootstrap Actions.
    2. On the Bootstrap Actions page, click Add Bootstrap Actions.
    3. In the Add Bootstrap Actions dialog box, configure the following parameters.
      Parameter                 Description
      Name                      The name of the bootstrap action that you want to add.
      Script Path               The OSS path where the script file is located. You must specify this parameter in the oss://**/*.sh format.
      Parameter                 The parameter of the bootstrap action script. It specifies the value of the variable that is referenced in the script.
      Target Nodes              • Cluster: The bootstrap action applies to the entire cluster.
                                • Host Group: The bootstrap action applies only to a specific machine group. You can select Core Instance Group, Master Instance Group, or an existing machine group.
      Execution At              • Before Component Startup: The system runs the script before the deployed components are started.
                                • After Component Startup: The system runs the script after the deployed components are started.
      Execution Failure Policy  • Proceed: If the script fails, the system continues to run the next script.
                                • Stop: If the script fails, the system stops running scripts.
    4. Click OK.

      For sample scripts, see the Examples section of this topic.
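Method 1 states that operational logs are stored in the /var/log/bootstrap-actions directory on each node. The following is a minimal sketch of how you might inspect them after you log on to a node. This topic does not describe the file names inside that directory, so the sketch prints the tail of every file it finds. The function name `show_bootstrap_logs` and the `LOG_DIR` variable are hypothetical.

```shell
#!/bin/bash
# Directory named in this topic; set LOG_DIR to try the sketch elsewhere.
LOG_DIR="${LOG_DIR:-/var/log/bootstrap-actions}"

show_bootstrap_logs() {
  local dir="$1"
  if [ ! -d "$dir" ]; then
    echo "no log directory: $dir"
    return 0
  fi
  # Print the last 20 lines of every regular file directly in the directory.
  find "$dir" -maxdepth 1 -type f | sort | while read -r f; do
    echo "== $f"
    tail -n 20 "$f"
  done
}

show_bootstrap_logs "$LOG_DIR"
```

Run the script on each node that you want to check, because each node keeps its own logs.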

Edit a bootstrap action

  1. Go to the Bootstrap Actions page.
    1. Log on to the Alibaba Cloud EMR console.
    2. In the top navigation bar, select the region where your cluster resides and select a resource group based on your business requirements.
    3. Click the Cluster Management tab.
    4. In the left-side navigation pane, click Bootstrap Actions.
  2. On the Bootstrap Actions page, find the bootstrap action that you want to edit and click Edit in the Actions column.
  3. In the Edit Bootstrap Action dialog box, modify the configuration items based on your business requirements. All configuration items can be modified.
  4. Click OK.

Clone a bootstrap action

  1. Go to the Bootstrap Actions page.
    1. Log on to the Alibaba Cloud EMR console.
    2. In the top navigation bar, select the region where your cluster resides and select a resource group based on your business requirements.
    3. Click the Cluster Management tab.
    4. In the left-side navigation pane, click Bootstrap Actions.
  2. On the Bootstrap Actions page, find the bootstrap action that you want to clone and click Clone in the Actions column.
  3. In the Clone Bootstrap Action dialog box, modify the configuration items based on your business requirements. All configuration items can be modified.
  4. Click OK.

Remove a bootstrap action

  1. Go to the Bootstrap Actions page.
    1. Log on to the Alibaba Cloud EMR console.
    2. In the top navigation bar, select the region where your cluster resides and select a resource group based on your business requirements.
    3. Click the Cluster Management tab.
    4. In the left-side navigation pane, click Bootstrap Actions.
  2. On the Bootstrap Actions page, find the bootstrap action that you want to remove and click Delete in the Actions column.
  3. In the Delete Bootstrap Action message, click OK.

Examples

When you add a bootstrap action, you must specify a name and the OSS path of a script file, and you can specify script parameters based on your business requirements. When the bootstrap action is performed, each node downloads the script from the specified OSS path and runs it, either directly or with the parameters that you specified. This section provides two examples:
  • Example 1
    You can specify the file that you want to download from OSS in the script. In this example, the following script downloads the <myfile>.tar.gz file from the oss://<yourbucket>/ directory and decompresses it to the /<yourdir> directory on the node.
    Notice: The OSS endpoint in the script can be an internal, public, or VPC endpoint. If you use the classic network, you must specify an internal endpoint. For example, the internal endpoint that corresponds to the China (Hangzhou) region is oss-cn-hangzhou-internal.aliyuncs.com. If you use a virtual private cloud (VPC), you must specify an endpoint that you can access from the VPC. For example, the endpoint that corresponds to the China (Hangzhou) region is vpc100-oss-cn-hangzhou.aliyuncs.com.
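    #!/bin/bash
    # Download the archive from OSS over the internal endpoint.
    osscmd --id=<yourid> --key=<yourkey> --host=oss-cn-hangzhou-internal.aliyuncs.com get oss://<yourbucket>/<myfile>.tar.gz ./<myfile>.tar.gz
    # Create the target directory and extract the archive into it.
    mkdir -p /<yourdir>
    tar -zxvf ./<myfile>.tar.gz -C /<yourdir>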
  • Example 2
    You can use YUM to install additional system software. For example, the following script installs ld-linux.so.2:
    #!/bin/bash
    yum install -y ld-linux.so.2
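The Parameter field supplies values for variables that the script references. The following is a minimal sketch of a script that takes such values. This topic does not state how the console hands the Parameter value to the script, so the sketch assumes positional arguments. The variable names `PACKAGE` and `WORK_DIR`, the default values, and the helper `build_install_cmd` are hypothetical. The install command is only printed, not run, so the sketch is safe to try outside a cluster.

```shell
#!/bin/bash
# Stop at the first failing command so the script exits with a non-zero status.
set -euo pipefail

# Assumption: $1 = package to install, $2 = working directory.
PACKAGE="${1:-ld-linux.so.2}"
WORK_DIR="${2:-/tmp/bootstrap-demo}"

# Build the install command as a string so it can be logged before it runs.
build_install_cmd() {
  echo "yum install -y $1"
}

mkdir -p "$WORK_DIR"
echo "work dir: $WORK_DIR"
echo "command:  $(build_install_cmd "$PACKAGE")"
# Uncomment on a cluster node to actually install the package:
# yum install -y "$PACKAGE"
```

Keeping the values in named variables at the top of the script makes it easier to reuse one script file across several bootstrap actions that differ only in their parameters.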