E-MapReduce (EMR) provides the Custom Software Settings feature, which lets you customize the configurations of software such as Hadoop, Hive, and Pig when you create a cluster. This topic describes how to customize software configurations.
Limits
You can use the Custom Software Settings feature only when you create a cluster.
Procedure
- Go to the Cluster Management page.
  - Log on to the Alibaba Cloud EMR console.
  - In the top navigation bar, select the region where your cluster resides and select a resource group based on your business requirements.
  - Click the Cluster Management tab.
- Click Cluster Wizard in the upper-right corner.
- In the Advanced Settings section of the Software Settings step, turn on Custom Software Settings.
You can specify a configuration file in the JSON format to overwrite default cluster parameters or add new ones. The following example shows the content of a configuration file in the JSON format:
    [
      {
        "ServiceName": "YARN",
        "FileName": "yarn-site",
        "ConfigKey": "yarn.nodemanager.resource.cpu-vcores",
        "ConfigValue": "8"
      },
      {
        "ServiceName": "YARN",
        "FileName": "yarn-site",
        "ConfigKey": "aaa",
        "ConfigValue": "bbb"
      }
    ]
- ServiceName: the service name. You must specify the service name in uppercase letters.
- FileName: the name of the configuration file, without the file name extension. For example, specify yarn-site for yarn-site.xml.
- ConfigKey: the name of a configuration item.
- ConfigValue: the value of the configuration item.
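The field rules above can be sketched in Python as follows. This is a minimal illustration, not part of EMR: the helper name `make_entry` is hypothetical, and passing every value as a string is an assumption based on the JSON example, where the vCore count appears as "8".

```python
import json
from pathlib import PurePosixPath


def make_entry(service, config_file, key, value):
    """Build one entry for the custom software settings JSON (hypothetical helper)."""
    return {
        # The service name must be in uppercase letters.
        "ServiceName": service.upper(),
        # The file name is passed without its suffix: yarn-site.xml -> yarn-site.
        "FileName": PurePosixPath(config_file).stem,
        "ConfigKey": key,
        # Assumption: values are passed as strings, as in the example above.
        "ConfigValue": str(value),
    }


entries = [
    make_entry("yarn", "yarn-site.xml", "yarn.nodemanager.resource.cpu-vcores", 8),
]

# Serialize to the JSON text that is pasted into Custom Software Settings.
payload = json.dumps(entries, indent=2)
print(payload)
```

Generating the entries this way keeps the uppercase and suffix rules in one place, which can help when many parameters are customized at once.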
The following table lists the configuration files of each service.

| Service | Configuration file |
| --- | --- |
| Hadoop | core-site.xml, log4j.properties, hdfs-site.xml, mapred-site.xml, yarn-site.xml, httpfs-site.xml, capacity-scheduler.xml, hadoop-env.sh, httpfs-env.sh, mapred-env.sh, yarn-env.sh |
| Pig | pig.properties, log4j.properties |
| Hive | hive-env.sh, hive-site.xml, hive-exec-log4j.properties, hive-log4j.properties |
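As a rough illustration, the table can be kept as a lookup structure to check whether a file is customizable before an entry is written for it. The names `CONFIG_FILES` and `services_for` are hypothetical. The keys are the service labels from the table, which are not assumed to be valid ServiceName values: the JSON example passes YARN as the ServiceName for yarn-site, a file that the table groups under Hadoop.

```python
# Configuration files per service, copied from the table above.
CONFIG_FILES = {
    "Hadoop": [
        "core-site.xml", "log4j.properties", "hdfs-site.xml",
        "mapred-site.xml", "yarn-site.xml", "httpfs-site.xml",
        "capacity-scheduler.xml", "hadoop-env.sh", "httpfs-env.sh",
        "mapred-env.sh", "yarn-env.sh",
    ],
    "Pig": ["pig.properties", "log4j.properties"],
    "Hive": [
        "hive-env.sh", "hive-site.xml",
        "hive-exec-log4j.properties", "hive-log4j.properties",
    ],
}


def services_for(config_file):
    """Return the table's service labels whose file list contains config_file."""
    return [name for name, files in CONFIG_FILES.items() if config_file in files]


print(services_for("hive-site.xml"))     # ['Hive']
print(services_for("log4j.properties"))  # ['Hadoop', 'Pig']
print(services_for("unknown.xml"))       # []
```

Because log4j.properties appears under more than one service, the lookup returns a list rather than a single label.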
After you customize software configurations, you can continue to create the cluster. For more information, see Create a cluster.