E-MapReduce: Customize software configurations

Last Updated: Jan 30, 2024

Services such as YARN and Hive contain a large number of configuration items. You can use the Custom Software Configuration feature provided by E-MapReduce (EMR) to modify existing configurations or add configuration items when you create a cluster.

Limits

You can use the Custom Software Configuration feature only when you create a cluster.

Procedure

  1. Log on to the EMR console. In the left-side navigation pane, click EMR on ECS.

  2. In the top navigation bar, select the region in which you want to create a cluster and select a resource group based on your business requirements.

  3. On the EMR on ECS page, click Create Cluster.

  4. In the Advanced Settings section of the Software Configuration step, click Show More and turn on Custom Software Configuration.

    When you create the cluster, you can add a configuration file in the JSON format to override the default values of existing parameters or to add parameters for the services in the cluster. The following sample code shows the content of such a configuration file:

    [
        {
            "ApplicationName":"YARN",
            "ConfigFileName":"yarn-site.xml",
            "ConfigItemKey":"yarn.nodemanager.resource.cpu-vcores",
            "ConfigItemValue":"8"
        },
        {
            "ApplicationName":"YARN",
            "ConfigFileName":"yarn-site.xml",
            "ConfigItemKey":"aaa",
            "ConfigItemValue":"bbb"
        }
    ]
    • The following table describes the parameters in the preceding configuration file.

      | Parameter | Description |
      | --- | --- |
      | ApplicationName | The name of the service. You must specify the service name in all uppercase. |
      | ConfigFileName | The name of the configuration file, which must be the name of the file that is actually passed. See the following note. |
      | ConfigItemKey | The name of a configuration item. |
      | ConfigItemValue | The value of the configuration item. |

      Note

      To ensure that a configuration file can be correctly applied to the desired cluster, take note of the naming details of the configuration file when the configuration file is passed.

      • A file name extension is required for a configuration file that is applied to a DataLake cluster, a Dataflow cluster, an online analytical processing (OLAP) cluster, a DataServing cluster, or a custom cluster. Example: yarn-site.xml.

      • A file name extension is not required for a configuration file that is applied to a Hadoop cluster. Example: yarn-site.
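
As an illustration of the parameter rules described above, the following Python sketch builds configuration entries and serializes them in the JSON format of the sample. The names `make_entry`, `to_config_json`, and the cluster type labels in `EXTENSION_REQUIRED` are hypothetical and chosen only for this example; they are not part of any EMR SDK. The sketch also assumes that the file name for a Hadoop cluster is the regular file name without its extension, as in the yarn-site example.

```python
import json

# Cluster types that require a file name extension, per the note above.
# The string labels are assumptions made for this sketch.
EXTENSION_REQUIRED = {"DataLake", "Dataflow", "OLAP", "DataServing", "Custom"}


def make_entry(application, config_file, key, value, cluster_type="DataLake"):
    """Build one configuration entry (hypothetical helper, not an EMR API)."""
    if cluster_type == "Hadoop":
        # Assumption: drop the extension, e.g. yarn-site.xml -> yarn-site.
        file_name = config_file.rsplit(".", 1)[0]
    elif cluster_type in EXTENSION_REQUIRED:
        if "." not in config_file:
            raise ValueError(f"{config_file!r} needs a file name extension")
        file_name = config_file
    else:
        raise ValueError(f"unknown cluster type: {cluster_type!r}")
    return {
        "ApplicationName": application.upper(),  # must be all uppercase
        "ConfigFileName": file_name,
        "ConfigItemKey": key,
        "ConfigItemValue": str(value),  # values are strings in the sample
    }


def to_config_json(entries):
    """Serialize the entries as a JSON array like the one in the sample."""
    return json.dumps(entries, indent=4)
```

For example, `make_entry("yarn", "yarn-site.xml", "yarn.nodemanager.resource.cpu-vcores", 8)` produces the first entry of the sample configuration, and passing `cluster_type="Hadoop"` changes only ConfigFileName, to yarn-site.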

    • The following table lists the configuration files of each service.

      | Service | Configuration files |
      | --- | --- |
      | YARN | core-site.xml, log4j.properties, hdfs-site.xml, mapred-site.xml, yarn-site.xml, httpfs-site.xml, capacity-scheduler.xml, hadoop-env.sh, httpfs-env.sh, mapred-env.sh, yarn-env.sh |
      | Hive | hive-env.sh, hive-site.xml, hive-exec-log4j.properties, hive-log4j.properties |
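
Before you paste a configuration file into the console, it can help to check that each ConfigFileName is listed for its service in the table above. The following Python sketch shows one way to do that. `KNOWN_CONFIG_FILES` and `find_unlisted` are hypothetical names for this example, the dictionary copies only a few rows of the table for brevity, and the service keys are written in uppercase because ApplicationName must be uppercase.

```python
# Abbreviated copy of the table above; extend it with the remaining rows.
KNOWN_CONFIG_FILES = {
    "YARN": {"yarn-site.xml", "mapred-site.xml", "capacity-scheduler.xml", "yarn-env.sh"},
    "HIVE": {"hive-site.xml", "hive-env.sh"},
}


def find_unlisted(entries):
    """Return (ApplicationName, ConfigFileName) pairs missing from the table."""
    unlisted = []
    for entry in entries:
        app = entry["ApplicationName"]
        name = entry["ConfigFileName"]
        known = KNOWN_CONFIG_FILES.get(app, set())
        # Also accept names without an extension, as used for Hadoop clusters.
        stems = {f.rsplit(".", 1)[0] for f in known}
        if name not in known and name not in stems:
            unlisted.append((app, name))
    return unlisted
```

An empty result means that every entry refers to a file in the copied rows. A non-empty result is only a hint to recheck the service name and file name, not proof that the cluster would reject the entry.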

    After you customize software configurations, you can continue to create the cluster. For more information, see Create a cluster.

References

After a cluster is created, you can adjust the settings of configuration items on the Configure tab of the desired service page. For more information, see Manage configuration items.