E-MapReduce: Manually scale in a node group

Last Updated: Jun 20, 2026

If your cluster load remains low for an extended period and many cluster resources are idle, consider scaling in the Core or Task node group to avoid wasting resources. For pay-as-you-go Task node groups, you can perform this operation in the console. For all other types (pay-as-you-go Core node groups, subscription Task node groups, and subscription Core node groups), follow the procedure described in this topic.

Limits

  • If the number of Core nodes in your cluster equals the HDFS replication factor, do not scale in Core nodes. Removing a node would leave too few nodes to hold every replica, which risks data loss.

  • If your cluster is a legacy Hadoop high availability (HA) cluster with two Master nodes, do not decommission the emr-worker-1 node (in legacy HA clusters, ZooKeeper runs on emr-worker-1).
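
The first limit can be checked before any node is touched. The sketch below is one way to express it, assuming that every Core node hosts a DataNode; can_scale_in_core is a hypothetical helper, not an EMR or Hadoop command. It treats a scale-in as acceptable only if the remaining Core node count is still at least the replication factor, which is an interpretation of the limit rather than a rule stated by the product.

```shell
# can_scale_in_core: hypothetical helper (not part of EMR or Hadoop).
# Usage: can_scale_in_core <core_node_count> <replication_factor> [nodes_to_remove]
# Succeeds (exit 0) only if the Core nodes left after removal can still hold
# every replica, i.e. remaining >= replication factor.
can_scale_in_core() {
  local core_count="$1" replication="$2" remove="${3:-1}"
  [ $(( core_count - remove )) -ge "$replication" ]
}

# On a cluster node the inputs could be gathered like this (not executed here):
#   replication=$(hdfs getconf -confKey dfs.replication)
#   core_count=<number of Core nodes shown on the Nodes page>
#   can_scale_in_core "$core_count" "$replication" || echo "Do not scale in Core nodes"
```

With three Core nodes and a replication factor of 3 the check fails, which matches the limit above.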

Important notes

  • All operations in this topic are irreversible. Once you start decommissioning, you cannot restore the original state.

  • This topic demonstrates best practices. Evaluate your cluster’s actual conditions carefully before proceeding to avoid job scheduling failures and data security risks.

How to choose nodes to decommission

Scaling in a node group works by decommissioning nodes within that group. Choose nodes based on your cluster service load. This topic provides two methods to check resource usage and identify candidates for decommissioning.

Method 1: EMR console Monitoring and Diagnostics

  1. On the Metric Monitoring page, check the AvailableVCores metric in the YARN-Queues dashboard. If AvailableVCores stays high for a long time, the queue has many idle vCores, and you can consider scaling in the Core or Task node group.

  2. On the Metric Monitoring page, check the AvailableGB metric in the YARN-NodeManagers dashboard. If a node’s AvailableGB remains high for a long time, it has ample available memory, and you can consider releasing that node.

Note

You can use other metrics in combination with your business requirements as criteria.

Method 2: YARN web UI

  1. Check queue resource usage. If queues consistently use little capacity, consider scaling in the Core or Task node group.

    On the YARN ResourceManager Application Queues page, view memory and vCores allocated and used for each partition (such as DEFAULT_PARTITION or batch). Use metrics like Capacity, Used, and Max Capacity in the legend to determine if queue resources are idle.

  2. On the Nodes page, sort by Mem Avail in descending order to quickly identify the nodes with the most available memory. Consider releasing those nodes.
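
The same ranking can be done outside the web UI. The sketch below assumes you have already produced one "hostname availableMemoryMB" pair per line, for example from the nodeHostName and availMemoryMB fields of the ResourceManager REST endpoint /ws/v1/cluster/nodes. How you extract those fields is left open, and rank_by_avail_mem is a hypothetical helper.

```shell
# rank_by_avail_mem: hypothetical helper.
# Reads "hostname availableMemoryMB" pairs on stdin and prints them with the
# largest available memory first; ties are ordered by hostname.
rank_by_avail_mem() {
  sort -k2,2nr -k1,1
}

# Example (not executed here): print the three nodes with the most free memory.
#   rank_by_avail_mem < nodes.txt | head -n 3
```

Treat the top entries as candidates only; the restrictions in this topic still apply to them.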

Important

If your cluster is a legacy Hadoop cluster, take special care in the following cases:

  • If your cluster is non-HA, do not decommission emr-worker-1 or emr-worker-2.

  • If your cluster is HA but has only two Master nodes, do not decommission emr-worker-1.
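
If you script node selection, the two legacy-cluster rules above can be encoded as a guard. This is a minimal sketch; is_protected_node and the mode labels non-ha and ha-2-master are hypothetical names chosen for illustration.

```shell
# is_protected_node: hypothetical helper encoding the two legacy-cluster rules.
# Usage: is_protected_node <non-ha|ha-2-master> <hostname>
# Succeeds (exit 0) if the node must NOT be decommissioned.
is_protected_node() {
  local mode="$1" short="${2%%.*}"   # drop any ".cluster-xxxxx" style suffix
  case "$mode:$short" in
    non-ha:emr-worker-1|non-ha:emr-worker-2) return 0 ;;
    ha-2-master:emr-worker-1)                return 0 ;;
    *)                                       return 1 ;;
  esac
}

# Example (not executed here):
#   is_protected_node non-ha "$candidate" && echo "skip $candidate"
```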

Step 1: Check services on nodes to decommission

Before decommissioning nodes to scale in a Core or Task node group, first decommission the relevant component services on those nodes. Then release the node resources. You can view deployed components on the Nodes page in the console.

On the Nodes page, expand the target node group, find the target node, and click the component count icon (for example, 12 components) on the right side of the node row. A pop-up layer shows all deployed components and their running status.

Step 2: Decommission node component services

If the node you plan to decommission runs any of the following services, decommission those services before releasing the node. Otherwise, you risk job scheduling failures and data security issues.

Decommission YARN NodeManager

  1. Go to the YARN service status page.

    1. Log on to the E-MapReduce console.

    2. In the top menu bar, select the region and resource group as needed.

    3. On the EMR on ECS page, click Services in the Actions column of the target cluster.

    4. On the Services page, click Status in the YARN service section.

  2. Decommission the NodeManager on the target node.

    1. In the Components section, click core_p0 > Unpublish in the Actions column for NodeManager.

    2. In the dialog box, select Execution Scope > Specific Machine, enter an Execution Reason, and click OK.

    3. In the confirmation dialog box, click OK.

  3. Click Operation History in the upper-right corner to track progress.

Decommission HDFS DataNode

  1. Log on to a Master node using SSH. For details, see Log on to a cluster.

  2. Switch to the hdfs user and check the state (active or standby) of each NameNode.

    sudo su - hdfs
    hdfs haadmin -getAllServiceState

  3. SSH into each NameNode host and edit the dfs.exclude file to add the hostname of the node you want to decommission. Add only one node at a time.

    • Legacy Hadoop clusters

      touch /etc/ecm/hadoop-conf/dfs.exclude
      vim /etc/ecm/hadoop-conf/dfs.exclude

      In vim, press o to start a new line and enter the hostname of the DataNode to decommission.

      emr-worker-3.cluster-xxxxx
      emr-worker-4.cluster-xxxxx

    • Non-legacy Hadoop clusters

      touch /etc/taihao-apps/hdfs-conf/dfs.exclude
      vim /etc/taihao-apps/hdfs-conf/dfs.exclude

      In vim, press o to start a new line and enter the hostname of the DataNode to decommission.

      core-1-3.c-0894dxxxxxxxxx
      core-1-4.c-0894dxxxxxxxxx

  4. On any NameNode host, switch to the hdfs user and run the refresh command. HDFS automatically starts decommissioning.

    sudo su - hdfs
    hdfs dfsadmin -refreshNodes

  5. Confirm the decommissioning result.

    Run the following command to check if decommissioning is complete.

    hdfs dfsadmin -report

    When the Decommission Status for the specified node shows Decommissioned, its data has migrated to other nodes, and decommissioning is complete.
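
The exclude-file edit and the final check in the steps above can be scripted. The sketch below is illustrative, not the product's own tooling: add_exclude_host and decommission_status are hypothetical helpers, and the parser assumes the report prints a "Hostname:" line followed by a "Decommission Status :" line for each DataNode, as recent Hadoop releases do.

```shell
# add_exclude_host: hypothetical helper.
# Appends a hostname to the exclude file only if it is not already listed.
add_exclude_host() {
  local file="$1" host="$2"
  touch "$file"
  grep -qxF -- "$host" "$file" || printf '%s\n' "$host" >> "$file"
}

# decommission_status: hypothetical helper.
# Reads `hdfs dfsadmin -report` output on stdin and prints the
# "Decommission Status" value reported for the given hostname.
decommission_status() {
  awk -v h="$1" '
    $1 == "Hostname:" { cur = $2 }
    /^Decommission Status/ && cur == h {
      sub(/^Decommission Status *: */, "")
      print
      exit
    }'
}

# Example flow on a NameNode host (not executed here; path from the
# non-legacy case above, run as the hdfs user):
#   add_exclude_host /etc/taihao-apps/hdfs-conf/dfs.exclude core-1-3.c-0894dxxxxxxxxx
#   hdfs dfsadmin -refreshNodes
#   hdfs dfsadmin -report | decommission_status core-1-3.c-0894dxxxxxxxxx
```

Repeat the last command until it prints Decommissioned. Intermediate values such as "Decommission in progress" mean data is still being migrated.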

Decommission StarRocks

  1. Log on to the cluster and connect using a client. For details, see Quick Start.

  2. Run the following command to decommission a BE using the DECOMMISSION method.

    ALTER SYSTEM DECOMMISSION backend "be_ip:be_heartbeat_service_port";

    Replace the following parameters based on your cluster:

    • be_ip: Find the internal IP address of the BE to scale in on the Nodes page.

    • be_heartbeat_service_port: The default is 9050. You can verify it using the show backends command.

    If decommissioning is slow, you can force decommissioning using the DROP method.

    Important

    If you use the DROP method, first make sure that the system still holds three complete replicas of the data.

    ALTER SYSTEM DROP backend "be_ip:be_heartbeat_service_port";

  3. Run the following command to monitor BE status.

    show backends;

    The output resembles the following:

    MySQL [(none)]> show backends;
    +----------+-----------------+------------+---------------+--------+----------+----------+---------------------+---------------------+-------+----------------------+------------------------+-----------+------------------+---------------+---------------+---------+---------+----------+--------+---------------+----------------------------------------------------+-------------------+-------------+----------+
    | BackendId | Cluster         | IP         | HeartbeatPort | BePort | HttpPort | BrpcPort | LastStartTime       | LastHeartbeat       | Alive | SystemDecommissioned | ClusterDecommissioned | TabletNum | DataUsedCapacity | AvailCapacity | TotalCapacity | UsedPct | MaxDiskUsedPct | ErrMsg | Version       | Status                                             | DataTotalCapacity | DataUsedPct | CpuCores |
    +----------+-----------------+------------+---------------+--------+----------+----------+---------------------+---------------------+-------+----------------------+------------------------+-----------+------------------+---------------+---------------+---------+---------+----------+--------+---------------+----------------------------------------------------+-------------------+-------------+----------+
    | 10049    | default_cluster | 192.16xxx  | 9050          | 9060   | 18040    | 8060     | 2022-11-29 15:47:28 | 2022-11-29 16:09:48 | true  | true                 | false                 | 0         | .000             | 312.400 GB    | 312.978 GB    | 0.18 %  | 0.19 %         |        | 2.3.2-dbc89ae | {"xxxcessReportTabletsTime":"2022-11-29 16:09:28"} | 312.400 GB        | 0.00 %      | 4        |
    | 10002    | default_cluster | 192.16xxx  | 9050          | 9060   | 18040    | 8060     | 2022-11-29 15:29:07 | 2022-11-29 16:09:48 | true  | false                | false                 | 10        | .000             | 312.399 GB    | 312.978 GB    | 0.19 %  | 0.19 %         |        | 2.3.2-dbc89ae | {"xxxcessReportTabletsTime":"2022-11-29 16:09:08"} | 312.399 GB        | 0.00 %      | 4        |
    | 10003    | default_cluster | 192.16xxx  | 9050          | 9060   | 18040    | 8060     | 2022-11-29 15:29:07 | 2022-11-29 16:09:48 | true  | false                | false                 | 10        | .000             | 312.399 GB    | 312.978 GB    | 0.19 %  | 0.19 %         |        | 2.3.2-dbc89ae | {"xxxcessReportTabletsTime":"2022-11-29 16:09:07"} | 312.399 GB        | 0.00 %      | 4        |
    | 10004    | default_cluster | 192.16xxx  | 9050          | 9060   | 18040    | 8060     | 2022-11-29 15:29:07 | 2022-11-29 16:09:48 | true  | false                | false                 | 10        | .000             | 312.399 GB    | 312.978 GB    | 0.19 %  | 0.19 %         |        | 2.3.2-dbc89ae | {"xxxcessReportTabletsTime":"2022-11-29 16:09:08"} | 312.399 GB        | 0.00 %      | 4        |
    +----------+-----------------+------------+---------------+--------+----------+----------+---------------------+---------------------+-------+----------------------+------------------------+-----------+------------------+---------------+---------------+---------+---------+----------+--------+---------------+----------------------------------------------------+-------------------+-------------+----------+
    4 rows in set (0.01 sec)

    A node with SystemDecommissioned set to true is undergoing decommissioning. When TabletNum reaches 0, the system cleans up metadata.

    If the BE node no longer appears in the results, decommissioning succeeded.
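
The monitoring step can be polled from a shell rather than read by eye. The sketch below assumes the mysql client is available, that the FE listens on the default query port 9030, and that batch mode (-B) prints tab-separated rows with a header line. decommission_sql and be_tablet_num are hypothetical helpers, and fe_host is a placeholder.

```shell
# decommission_sql: hypothetical helper.
# Builds the DECOMMISSION statement for one BE; the port defaults to 9050.
decommission_sql() {
  local be_ip="$1" port="${2:-9050}"
  printf 'ALTER SYSTEM DECOMMISSION backend "%s:%s";\n' "$be_ip" "$port"
}

# be_tablet_num: hypothetical helper.
# Reads tab-separated `show backends` output (header row first) on stdin and
# prints TabletNum for the BE with the given IP. Prints nothing if the BE is
# no longer listed, which the text above treats as success.
be_tablet_num() {
  awk -F'\t' -v ip="$1" '
    NR == 1 {
      for (i = 1; i <= NF; i++) {
        if ($i == "IP") c = i
        if ($i == "TabletNum") t = i
      }
      next
    }
    $c == ip { print $t }'
}

# Example (not executed here; fe_host and port 9030 are assumptions):
#   mysql -h fe_host -P 9030 -u root -e "$(decommission_sql 192.168.0.10)"
#   mysql -h fe_host -P 9030 -u root -B -e 'show backends' | be_tablet_num 192.168.0.10
```

Looking up columns by header name keeps the parser independent of column order, which may differ between StarRocks versions.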

Decommission HBase HRegionServer

  1. Go to the HBase service status page.

    1. Log on to the E-MapReduce console.

    2. In the top menu bar, select the region and resource group as needed.

    3. On the EMR on ECS page, click Services in the Actions column of the target cluster.

    4. On the Services page, click Status in the HBase service section.

  2. Decommission the HRegionServer on the target node.

    1. In the Components section, click STOP in the Actions column for HRegionServer.

    2. In the dialog box, select Execution Scope > Specific Machine, enter an Execution Reason, and click OK.

    3. In the confirmation dialog box, click OK.

  3. Click Operation History in the upper-right corner to track progress.

Decommission HBASE-HDFS DataNode

  1. Log on to a Master node using SSH. For details, see Log on to a cluster.

  2. Run the following commands to switch to the hdfs user and set environment variables.

    sudo su - hdfs
    export HADOOP_CONF_DIR=/etc/taihao-apps/hdfs-conf/namenode

  3. Run the following command to view the current state of the HBASE-HDFS file system, including its DataNodes.

    hdfs dfsadmin -report

  4. SSH into each NameNode host and edit the dfs.exclude file to add the hostname of the node you want to decommission. Add only one node at a time.

    touch /etc/taihao-apps/hdfs-conf/dfs.exclude
    vim /etc/taihao-apps/hdfs-conf/dfs.exclude

    In vim, press o to start a new line and enter the hostname of the DataNode to decommission.

    core-1-3.c-0894dxxxxxxxxx
    core-1-4.c-0894dxxxxxxxxx

  5. On any NameNode host, switch to the hdfs user and run the refresh command. HDFS automatically starts decommissioning.

    sudo su - hdfs
    export HADOOP_CONF_DIR=/etc/taihao-apps/hdfs-conf/namenode
    hdfs dfsadmin -refreshNodes

  6. Confirm the decommissioning result.

    Run the following command to check if decommissioning is complete.

    hdfs dfsadmin -report

    When the Decommission Status for the specified node shows Decommissioned, its data has migrated to other nodes, and decommissioning is complete.

Decommission SmartData JindoStorageService (legacy Hadoop clusters)

  1. Go to the SmartData service status page.

    1. Log on to the E-MapReduce console.

    2. In the top menu bar, select the region and resource group as needed.

    3. On the EMR on ECS page, click Services in the Actions column of the target cluster.

    4. On the Services page, click Status in the SmartData service section.

  2. Decommission the JindoStorageService on the target node.

    1. In the Components section, click core_p0 > Unpublish in the Actions column for JindoStorageService.

    2. In the dialog box, select Execution Scope > Specific Machine, enter an Execution Reason, and click OK.

    3. In the confirmation dialog box, click OK.

  3. Click Operation History in the upper-right corner to track progress.

Step 3: Release decommissioned nodes

Important

You must log on to the ECS console to manage cluster nodes. If you are a Resource Access Management (RAM) user, you need ECS permissions. We recommend granting the AliyunECSFullAccess policy.

  1. Go to the node management page.

    1. Log on to the E-MapReduce console.

    2. In the top menu bar, select the region and resource group as needed.

    3. On the EMR on ECS page, click Nodes in the Actions column of the target cluster.

  2. On the Nodes page, click the ECS ID of the node you want to release.

    You will be redirected to the ECS console.

  3. Release the instance in the ECS console. For details, see Release instances.

References

  • For scaling in pay-as-you-go and spot instance Task node groups, see Scale in a cluster.

  • If your cluster lacks compute resources, scale out the Core or Task node group. For details, see Scale out a cluster.

  • To automatically adjust cluster compute resources based on business needs, configure managed or custom Auto Scaling rules for node groups. For details, see Auto Scaling.