E-MapReduce: Scale in a core node group

Last Updated: Mar 18, 2025

If the load of your E-MapReduce (EMR) cluster remains at a low level for an extended period and large amounts of cluster resources are in an idle state, you can scale in the core node group of the cluster to prevent resource waste. This topic describes how to scale in a core node group.

Limits

  • You cannot scale in the core node group of a subscription cluster.

  • You cannot scale in the core node group of a Hadoop cluster in which the number of core nodes is the same as the number of Hadoop Distributed File System (HDFS) replicas.
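
The second limit can be checked before you start. The following is a minimal sketch, not an official tool: the function name can_scale_in_core is hypothetical, and it assumes that you read the core node count from the Nodes tab and the replica count from the dfs.replication setting. It also assumes that a core node count lower than the replica count is not eligible either.

```shell
# Hypothetical helper: succeeds (exit code 0) only if the number of core nodes
# is greater than the HDFS replication factor.
can_scale_in_core() {
  core_nodes="$1"   # current number of core nodes in the cluster
  replicas="$2"     # HDFS replication factor (dfs.replication)
  [ "$core_nodes" -gt "$replicas" ]
}

# Example usage on a cluster node (requires the hdfs client):
#   replicas="$(hdfs getconf -confKey dfs.replication)"
#   can_scale_in_core 5 "$replicas" && echo "not blocked by the replica limit"
```

The check covers only the replica limit. The subscription limit must still be verified in the console.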

Precautions

  • The operations in this topic cannot be rolled back. The components of a service cannot be recovered after you unpublish the components.

  • This topic describes the best practices for scale-in operations. We recommend that you evaluate the impacts on your business before you scale in a node group and proceed with caution. This can help prevent job scheduling failures and data security risks.

Step 1: Select the node that you want to remove

Select the node that you want to remove from your cluster based on the service load of the cluster. You can use one of the following methods to view the resource usage of a cluster and select the node that you want to remove:

Method 1: View specific metrics on the Metric Monitoring subtab of the Monitoring and Diagnostics tab of the EMR console

  1. On the Metric Monitoring subtab, select YARN-Queues from the Dashboard drop-down list, find the AvailableVCores metric, and then view the value of the metric. If the value of the AvailableVCores metric is large, a large number of available cores exist in the queue. In this case, you can scale in the core node group of the cluster.

  2. On the Metric Monitoring subtab, select YARN-NodeManagers from the Dashboard drop-down list, find the AvailableGB metric, and then view the value of the metric. If the value of the AvailableGB metric is large, the nodes on which NodeManager is deployed have a large amount of available memory resources. In this case, you can remove the nodes.

Note

You can determine whether a node needs to be removed based on your business requirements and specific metrics.
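
To compare such values against total capacity, the idle share can be computed with a small helper. This is only a sketch: the function name idle_percent is hypothetical, and the numbers are assumed to be read manually from the dashboard or from the YARN ResourceManager REST API (the host name and port 8088 in the usage comment are assumptions that may differ in your cluster).

```shell
# Hypothetical helper: prints the integer percentage of a resource that is idle.
# $1: available amount (for example, AvailableVCores)
# $2: total amount (for example, the total vcores of the queue or cluster)
idle_percent() {
  available="$1"
  total="$2"
  awk -v a="$available" -v t="$total" \
    'BEGIN { if (t <= 0) { print 0 } else { printf "%d\n", a * 100 / t } }'
}

# Example usage:
#   idle_percent 48 64
# The ResourceManager REST API exposes cluster-wide values such as
# availableVirtualCores and totalVirtualCores (host and port are assumptions):
#   curl -s http://<resourcemanager-host>:8088/ws/v1/cluster/metrics
```

What counts as a large idle share depends on your workload. The helper only makes the comparison explicit.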

Method 2: View the resource usage on the web UI of YARN

  1. View the queue resource usage of a cluster. If the queue resources are often underutilized, you can scale in the core node group of the cluster.

  2. On the Nodes page, sort the nodes by Node Address and identify the node that has the largest amount of available memory resources. Then, remove the node.

Important

If your cluster is a Hadoop cluster, take note of the following items before you scale in the core node group of the Hadoop cluster:

  • If your cluster is a non-high-availability cluster, you cannot remove the emr-worker-1 or emr-worker-2 node from the cluster.

  • If your cluster is a high-availability cluster but the number of master nodes in the cluster is 2, you cannot remove the emr-worker-1 node from the cluster.
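
The two rules above can be written as a guard that runs before a node is selected for removal. This is a sketch under stated assumptions: the function name is_protected_node and the mode labels non-ha and ha-2-masters are hypothetical and are not values used by EMR itself.

```shell
# Hypothetical helper: succeeds (exit code 0) if the node must NOT be removed
# from a Hadoop cluster.
# $1: node name or hostname, for example emr-worker-1.cluster-xxxxx
# $2: "non-ha"       for a non-high-availability cluster
#     "ha-2-masters" for a high-availability cluster with two master nodes
is_protected_node() {
  name="${1%%.*}"   # keep only the part before the first dot
  case "$2:$name" in
    non-ha:emr-worker-1|non-ha:emr-worker-2) return 0 ;;
    ha-2-masters:emr-worker-1) return 0 ;;
    *) return 1 ;;
  esac
}

# Example usage:
#   if is_protected_node emr-worker-2 non-ha; then
#     echo "do not remove this node"
#   fi
```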

Step 2: View the components deployed on a node

Specific service components are deployed on the nodes of a core node group to run compute tasks. The components store the data of specific services. Before you can scale in a core node group of a cluster, you must unpublish the components of specific services that are deployed in the cluster. You can view the components that are deployed on a node on the Nodes tab in the EMR console.

Step 3: Unpublish the components of a node

If the following components are deployed on the node that you want to remove, you must unpublish the components before you remove the node. If you remove a node on which the following components are deployed without unpublishing the components, jobs that run on the node may fail and data security risks may occur.

Unpublish the NodeManager component of the YARN service

  1. Go to the Status tab of the YARN service page.

    1. Log on to the EMR console. In the left-side navigation pane, click EMR on ECS.

    2. In the top navigation bar, select a region and a resource group based on your business requirements.

    3. On the EMR on ECS page, find the desired cluster and click Services in the Actions column.

    4. On the Services tab, find the YARN service and click Status.

  2. Unpublish the NodeManager component that is deployed on a desired node.

    1. In the Components section of the Status tab, find NodeManager, move the pointer over the icon in the Actions column, and then select Unpublish.

    2. In the dialog box that appears, set the Execution Scope parameter to Specific Machine, select the desired node, configure the Execution Reason parameter, and then click OK.

    3. In the Confirm message, click OK.

  3. In the upper-right corner of the Services tab, click Operation History and view the operation progress.

Unpublish the DataNode component of the HDFS service

  1. Log on to the master node of your cluster in SSH mode. For more information, see Log on to a cluster.

  2. Switch to the hdfs user and view the number of NameNodes.

    sudo su - hdfs
    hdfs haadmin -getAllServiceState

  3. Log on in SSH mode to each node on which NameNode is deployed, and add the hostname of the node whose DataNode component you want to unpublish to the dfs.exclude file. We recommend that you add only one node at a time.

    • Hadoop clusters

      touch /etc/ecm/hadoop-conf/dfs.exclude
      vim /etc/ecm/hadoop-conf/dfs.exclude

      In vim, press o to start a new line, and then enter the hostname of each node whose DataNode component you want to unpublish, one hostname per line.

      emr-worker-3.cluster-xxxxx
      emr-worker-4.cluster-xxxxx

    • Non-Hadoop clusters

      touch /etc/taihao-apps/hdfs-conf/dfs.exclude
      vim /etc/taihao-apps/hdfs-conf/dfs.exclude

      In vim, press o to start a new line, and then enter the hostname of each node whose DataNode component you want to unpublish, one hostname per line.

      core-1-3.c-0894dxxxxxxxxx
      core-1-4.c-0894dxxxxxxxxx

  4. Switch to the hdfs user on a node on which NameNode is deployed and run the following commands. Then, HDFS automatically starts to unpublish the DataNode component.

    sudo su - hdfs
    hdfs dfsadmin -refreshNodes

  5. Confirm the result.

    Run the following command to check whether the DataNode component is unpublished:

    hdfs dfsadmin -report

    If the status is Decommissioned, the data of the DataNode component is migrated to other nodes and the DataNode component is unpublished.
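
Because the report lists every DataNode, it can help to extract the status of a single node. The following is a minimal sketch: the function name decommission_status is hypothetical, and it assumes that the report contains a "Hostname:" line followed by a "Decommission Status :" line for each DataNode, which may vary across Hadoop versions.

```shell
# Hypothetical helper: reads the text of a DataNode report on standard input
# and prints the decommission status of one DataNode.
# $1: hostname of the DataNode, as written in the dfs.exclude file
decommission_status() {
  awk -v host="$1" '
    /^Hostname:/ { current = $2 }
    /^Decommission Status/ && current == host {
      sub(/^Decommission Status *: */, "")
      print
      exit
    }
  '
}

# Example usage as the hdfs user on a NameNode host. The 60-second polling
# interval is an arbitrary choice:
#   until [ "$(hdfs dfsadmin -report | decommission_status core-1-3.c-0894dxxxxxxxxx)" = "Decommissioned" ]; do
#     sleep 60
#   done
```

The loop in the usage comment waits until the status becomes Decommissioned, which can take a long time if the node stores a large amount of data.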

Unpublish the Backend component of the StarRocks service

  1. Log on to your cluster and use a client to access the cluster. For more information, see Getting started.

  2. Run the following command to unpublish the Backend component:

    ALTER SYSTEM DECOMMISSION backend "be_ip:be_heartbeat_service_port";

    Configure the following parameters in the preceding command based on your business requirements:

    • be_ip: You can find the desired node and obtain the internal IP address of the node on the Nodes tab.

    • be_heartbeat_service_port: The default value is 9050. You can run the show backends command to obtain the service port.

    If the Backend component is unpublished too slowly, you can run the DROP command to forcefully unpublish the component.

    Important

    If you run the DROP command to unpublish the Backend component, make sure that the system contains three replicas.

    ALTER SYSTEM DROP backend "be_ip:be_heartbeat_service_port";

  3. Run the following command to check the status of the Backend component:

    show backends;

    If the value in the SystemDecommissioned column is true, the Backend component is being unpublished. If the value in the TabletNum column is 0, the system cleans up the metadata.

    If the Backend component no longer appears in the output of the show backends command, the component is unpublished.
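
The statement in step 2 can be generated from the node's IP address instead of being typed by hand. This is a sketch only: the function name decommission_sql is hypothetical, and the frontend host, query port 9030, and root user in the usage comment are assumptions about a default StarRocks setup.

```shell
# Hypothetical helper: prints the DECOMMISSION statement for one Backend.
# $1: internal IP address of the node (be_ip)
# $2: heartbeat service port (be_heartbeat_service_port), 9050 if omitted
decommission_sql() {
  be_ip="$1"
  be_port="${2:-9050}"
  printf 'ALTER SYSTEM DECOMMISSION BACKEND "%s:%s";\n' "$be_ip" "$be_port"
}

# Example usage with a MySQL-compatible client (host, port, and user are
# assumptions):
#   decommission_sql 192.168.0.10 | mysql -h <fe_host> -P 9030 -u root
```

Printing the statement first makes it possible to review the address and port before anything is sent to the cluster.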

Unpublish the HRegionServer component of the HBase service

  1. Go to the Status tab of the HBase service page.

    1. Log on to the EMR console. In the left-side navigation pane, click EMR on ECS.

    2. In the top navigation bar, select a region and a resource group based on your business requirements.

    3. On the EMR on ECS page, find the desired cluster and click Services in the Actions column.

    4. On the Services tab, find the HBase service and click Status.

  2. Unpublish the HRegionServer component that is deployed on a desired node.

    1. In the Components section of the Status tab, find HRegionServer and click Stop in the Actions column.

    2. In the dialog box that appears, set the Execution Scope parameter to Specific Machine, select the desired node, configure the Execution Reason parameter, and then click OK.

    3. In the Confirm message, click OK.

  3. In the upper-right corner of the Services tab, click Operation History and view the operation progress.

Unpublish the DataNode component of the HBase-HDFS service

  1. Log on to the master node of your cluster in SSH mode. For more information, see Log on to a cluster.

  2. Run the following commands to switch to the hdfs user and configure the environment variable:

    sudo su - hdfs
    export HADOOP_CONF_DIR=/etc/taihao-apps/hdfs-conf/namenode

  3. Run the following command to view information about the NameNode:

    hdfs dfsadmin -report

  4. Log on in SSH mode to each node on which NameNode is deployed, and add the hostname of the node whose DataNode component you want to unpublish to the dfs.exclude file. We recommend that you add only one node at a time.

    touch /etc/taihao-apps/hdfs-conf/dfs.exclude
    vim /etc/taihao-apps/hdfs-conf/dfs.exclude

    In vim, press o to start a new line, and then enter the hostname of each node whose DataNode component you want to unpublish, one hostname per line.

    core-1-3.c-0894dxxxxxxxxx
    core-1-4.c-0894dxxxxxxxxx

  5. Switch to the hdfs user on a node on which NameNode is deployed and run the following commands. Then, HDFS automatically starts to unpublish the DataNode component.

    sudo su - hdfs
    export HADOOP_CONF_DIR=/etc/taihao-apps/hdfs-conf/namenode
    hdfs dfsadmin -refreshNodes

  6. Confirm the result.

    Run the following command to check whether the DataNode component is unpublished:

    hdfs dfsadmin -report

    If the status is Decommissioned, the data of the DataNode component is migrated to other nodes and the DataNode component is unpublished.
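
As an alternative to editing the dfs.exclude file in vim, the hostname can be appended from the command line. The following is a minimal sketch: the function name add_to_exclude is hypothetical, and it assumes that the current user has write permission on the file. The refreshNodes command must still be run afterward.

```shell
# Hypothetical helper: adds one hostname to an exclude file, one hostname per
# line, and does nothing if the hostname is already listed.
# $1: path of the exclude file, $2: hostname to add
add_to_exclude() {
  exclude_file="$1"
  host="$2"
  touch "$exclude_file"
  # If the file is not empty and lacks a trailing newline, add one first so
  # that the new hostname starts on its own line.
  if [ -s "$exclude_file" ] && [ -n "$(tail -c 1 "$exclude_file")" ]; then
    printf '\n' >> "$exclude_file"
  fi
  # Append only if no line matches the hostname exactly.
  grep -qxF "$host" "$exclude_file" || printf '%s\n' "$host" >> "$exclude_file"
}

# Example usage on a node on which NameNode is deployed:
#   add_to_exclude /etc/taihao-apps/hdfs-conf/dfs.exclude core-1-3.c-0894dxxxxxxxxx
```

Running the helper twice with the same hostname leaves a single entry, which keeps the file clean when a step is repeated.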

Unpublish the JindoStorageService component of the SmartData service (Hadoop clusters)

  1. Go to the Status tab of the SmartData service page.

    1. Log on to the EMR console. In the left-side navigation pane, click EMR on ECS.

    2. In the top navigation bar, select a region and a resource group based on your business requirements.

    3. On the EMR on ECS page, find the desired cluster and click Services in the Actions column.

    4. On the Services tab, find the SmartData service and click Status.

  2. Unpublish the JindoStorageService component that is deployed on a desired node.

    1. In the Components section of the Status tab, find JindoStorageService, move the pointer over the icon in the Actions column, and then select Unpublish.

    2. In the dialog box that appears, set the Execution Scope parameter to Specific Machine, select the desired node, configure the Execution Reason parameter, and then click OK.

    3. In the Confirm message, click OK.

  3. In the upper-right corner of the Services tab, click Operation History and view the operation progress.

Step 4: Remove a node

Important

To remove nodes from a node group of your EMR cluster, you must go to the ECS console and release the ECS instances that correspond to the nodes. If you want to perform this operation as a RAM user, you must have the required ECS permissions. We recommend that you attach the AliyunECSFullAccess policy to the RAM user.

  1. Go to the EMR on ECS page.

    1. Log on to the EMR console. In the left-side navigation pane, click EMR on ECS.

    2. In the top navigation bar, select a region and a resource group based on your business requirements.

    3. On the EMR on ECS page, find the desired cluster and click Nodes in the Actions column.

  2. On the Nodes tab, find the node that you want to remove and click the ID of the node.

    The Instance Details tab of the ECS console appears.

  3. Release the instance in the ECS console. For more information, see Release an instance.

References

  • For information about how to scale in a task node group that contains pay-as-you-go or preemptible instances, see Scale in a cluster.

  • If the computing resources of a cluster are insufficient, you can scale out the core and task node groups. For more information, see Scale out an EMR cluster.

  • If you want the system to automatically adjust the computing resources of a cluster based on your business requirements, you can configure managed auto scaling rules or custom auto scaling rules for the node groups in the cluster. For more information, see Auto scaling.