Unexpected operations on an E-MapReduce (EMR) cluster can make the cluster unstable or unavailable. This topic describes the limits of EMR and the operations that you must avoid to prevent these issues.

Notice: When you use EMR, you must perform all operations in the EMR console. We recommend that you do not perform operations in the Elastic Compute Service (ECS) console, because operations performed there may cause cluster instability or abnormalities. If you perform the prohibited operations that are described in this topic, you bear the consequences and responsibility.

High-risk operations (prohibited)

Operation: Delete or modify the hosts file that is stored in the /etc/ directory.
Possible result: The services that run on the nodes of your cluster cannot be found, which causes service exceptions.
Suggestion: Only add information to the hosts file. Do not delete or modify existing entries.

Operation: Modify parameters in component configuration files in the ECS console.
Possible result: After specific services are restarted, the modified parameter settings are overwritten.
Suggestion: Modify the parameters in the EMR console.

Operation: Attach disks to the nodes of your EMR cluster in the ECS console.
Possible result: The disks are unavailable because EMR cannot recognize and initialize them.
Suggestion: Add data disks in the EMR console.

Operation: Detach disks from the nodes of your EMR cluster in the ECS console.
Possible result: Data may be lost because EMR is unaware of the detach operation.
Suggestion: Perform the following operations on specific nodes:
  • Core nodes: Disable the HDFS, YARN, HBase, Kudu, or Kafka component. Then, stop and disconnect the nodes from your cluster.
  • Task nodes: Disable the YARN component. Then, stop and disconnect the nodes from your cluster.

Operation: Remove core nodes in the ECS console.
Possible result: Data is lost, and the jobs that run on the removed nodes fail.
Suggestion: Disable the HDFS, YARN, HBase, Kudu, or Kafka component.

Operation: Remove master nodes in the ECS console.
Possible result:
  • For an HA cluster, if you remove master nodes, the high availability switchover of the HDFS NameNode, YARN ResourceManager, or HBase HMaster fails. In this case, you must purchase a new EMR cluster to migrate data or tasks.
  • For a non-HA cluster, if you remove the master node, the cluster becomes unavailable, and you cannot migrate data or tasks.
Suggestion: N/A

Operation: Remove task nodes in the ECS console.
Possible result: The jobs that run on the removed nodes fail.
Suggestion: Stop the NodeManager of YARN.

Operation: Stop the MySQL service of the master node. (Type is set to Built-in MySQL when you create an EMR cluster.)
Possible result: The MySQL service deployed on the emr-header-1 node is associated with Hive MetaStore, Oozie, and Ranger. If you stop the MySQL service, the associated components cannot access their databases.
Suggestion: N/A

Operation: Change the password of the root user that is used to access the MySQL service deployed on the emr-header-1 node. (Type is set to Built-in MySQL when you create an EMR cluster.)
Possible result: Associated components such as Hue or Ranger fail.
Suggestion: N/A

Operation: Modify the security group to which ECS instances belong when an EMR cluster is running.
Possible result:
  • The network connection between nodes is abnormal.
  • Components become unavailable.
Suggestion: N/A
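
The suggestion for the hosts file is to add information rather than delete or modify what is already there. The following minimal Python sketch shows one way to enforce that rule: it only appends new entries and refuses to touch a hostname that is already mapped. The function name `append_hosts_entries`, the hostname `my-db`, and the IP addresses are hypothetical and used for illustration only. The sketch works on the file content as a string and does not write to any file.

```python
def append_hosts_entries(hosts_text, new_entries):
    """Return hosts_text with new_entries appended; existing lines stay as-is.

    new_entries is a list of (ip, hostname) pairs. A ValueError is raised if a
    hostname is already mapped, because changing an existing mapping would be
    a modification rather than an addition.
    """
    existing = set()
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split into whitespace-separated fields.
        fields = line.split("#", 1)[0].split()
        # The first field is the IP address; the rest are hostnames and aliases.
        existing.update(fields[1:])

    additions = []
    for ip, hostname in new_entries:
        if hostname in existing:
            raise ValueError(f"{hostname} is already mapped; refusing to change it")
        additions.append(f"{ip} {hostname}")
        existing.add(hostname)

    if not additions:
        return hosts_text
    # Make sure the appended lines start on a line of their own.
    separator = "" if hosts_text.endswith("\n") or not hosts_text else "\n"
    return hosts_text + separator + "\n".join(additions) + "\n"
```

Because the original content is always returned as an unchanged prefix, the entries that the cluster services rely on are preserved. Review the result before you write it back to a node.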

FAQ

Problem description: Insufficient disk capacity
Solution: Increase the capacity of a single disk or add core nodes in the EMR console. EMR clusters do not support the addition of disks.

Problem description: Excess disk capacity
Solution: Purchase a new cluster and release the original one. For more information, see Create a cluster. EMR clusters do not support scale-down of disk capacity.

Problem description: Insufficient computing capabilities
Solution: Add task nodes in the EMR console. For more information, see Scale out a cluster.

Problem description: Excess computing capabilities
Solution: The solution depends on the billing method of the cluster:
  • For a pay-as-you-go cluster, remove task nodes in the EMR console.
  • For a subscription cluster, stop the NodeManager of YARN on a specific task node, change the billing method of the ECS instance that serves as the task node to pay-as-you-go in the ECS console, and then release the instance.

Problem description: Outdated component versions
Solution: Purchase a cluster of a later version. For more information, see Create a cluster. Existing clusters do not support the version update of a single component.

Problem description: Conversion from a non-HA cluster to an HA cluster
Solution: We recommend that you purchase an HA cluster. Existing non-HA clusters cannot be converted into HA clusters.

Problem description: Deployment of third-party software or services on EMR
Solution: We recommend that you perform bootstrap actions to install third-party software or services when you create a cluster. If you manually install third-party software or services after you create a cluster, you must manually install them again on each node that you add later.
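
Because third-party software may have to be installed both at cluster creation and again on nodes that are added later, it can help to make the install step safe to run more than once. The following Python sketch shows that idea under stated assumptions: the function name `install_once` and the marker-file convention are hypothetical, not part of EMR, and the sketch assumes that your bootstrap script can invoke Python on the node.

```python
import subprocess
from pathlib import Path


def install_once(name, command, marker_dir, run=subprocess.run):
    """Run an install command unless a marker file shows that it already ran.

    Returns True if the command was executed and False if it was skipped.
    The run parameter exists so that the command runner can be replaced in tests.
    """
    marker = Path(marker_dir) / f"{name}.installed"
    if marker.exists():
        return False
    # check=True makes a failed command raise CalledProcessError,
    # so the marker is written only after a successful install.
    run(command, check=True)
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.write_text("ok\n")
    return True
```

For example, a call such as `install_once("htop", ["yum", "install", "-y", "htop"], "/var/lib/bootstrap-markers")` would install the package on the first run and skip it afterwards. The package name, the package manager, and the marker directory are placeholders that you must adapt to your nodes.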