E-MapReduce: Fix a single abnormal JournalNode

Last Updated: Mar 26, 2026

If exactly one JournalNode in your cluster is abnormal, you can restore it by syncing data from a healthy JournalNode on another node. This procedure applies only when a single JournalNode is abnormal.

How it works

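HDFS high availability relies on a quorum of JournalNodes to replicate edit logs between the active and standby NameNodes. Each edit must be acknowledged by a majority of JournalNodes, so the cluster keeps working while a single JournalNode is abnormal. Any healthy JournalNode holds the storage metadata that this procedure copies to the abnormal node: the VERSION file, the epoch files, and committed-txid. The edit log files themselves are excluded from the copy.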
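
The majority rule can be illustrated with a little arithmetic. The following is a minimal sketch, not part of the EMR tooling; the function names jn_majority and jn_tolerated_failures are made up for this example.

```shell
# Quorum arithmetic for a JournalNode ensemble (illustrative helper names).
# A write succeeds once a majority of JournalNodes acknowledge it.

jn_majority() {
    # Smallest number of JournalNodes that forms a majority of $1 nodes.
    echo $(( $1 / 2 + 1 ))
}

jn_tolerated_failures() {
    # Number of JournalNodes that can be down while a majority remains.
    echo $(( ($1 - 1) / 2 ))
}

for n in 3 5; do
    echo "$n JournalNodes: majority=$(jn_majority "$n"), tolerated failures=$(jn_tolerated_failures "$n")"
done
```

With three JournalNodes the majority is 2 and one failure is tolerated, which is consistent with this procedure covering only the single-failure case.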

Prerequisites

Before you begin, make sure you have:

  • At least one healthy JournalNode in the cluster

  • SSH access to both the healthy node and the abnormal node

  • Access to the E-MapReduce (EMR) console to stop and start HDFS components

Applicability

Condition                                 This procedure applies
Exactly one JournalNode is abnormal       Yes
Two or more JournalNodes are abnormal     No

Restore the abnormal JournalNode

Step 1: Identify a healthy JournalNode

Check the status of all JournalNodes on the web user interface (UI) of HDFS. For more information, see Web UIs of HDFS components.

Confirm that at least one JournalNode shows a healthy status before proceeding.
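
If you are unsure which hosts run a JournalNode, the list can usually be derived from the HDFS configuration. The sketch below assumes the cluster uses a qjournal:// URI in dfs.namenode.shared.edits.dir. The function name list_journalnodes and the hostnames jn-1 to jn-3 are hypothetical, and the journal ID emr-cluster is inferred from the data directory used later in this topic.

```shell
# Extract JournalNode host:port pairs from a qjournal URI, one per line.
# Expected URI shape: qjournal://host1:8485;host2:8485;host3:8485/journal-id
list_journalnodes() {
    uri=$1
    rest=${uri#qjournal://}   # drop the scheme
    hosts=${rest%%/*}         # drop the trailing /journal-id
    echo "$hosts" | tr ';' '\n'
}

# On a cluster node, the URI could be read from the HDFS configuration:
#   list_journalnodes "$(hdfs getconf -confKey dfs.namenode.shared.edits.dir)"

# Example with hypothetical hostnames:
list_journalnodes 'qjournal://jn-1:8485;jn-2:8485;jn-3:8485/emr-cluster'
```

The output is one host:port pair per line, which you can compare against the JournalNode status shown on the HDFS web UI.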

Step 2: Package the restore data from the healthy node

Log on to the node where the healthy JournalNode resides. For steps, see Log on to a cluster. Select a header or master node when possible.

  1. Switch to the hdfs user.

    su hdfs
  2. Go to the JournalNode data directory.

    cd /mnt/disk1/hdfs/journal/emr-cluster/
  3. Package the restore files, excluding edit logs.

    tar --exclude='edits*' -zcvf /tmp/jn-current.tar.gz current

    The expected output is:

    current/
    current/last-writer-epoch
    current/VERSION
    current/last-promised-epoch
    current/paxos/
    current/committed-txid
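
Before copying the package, it may be worth confirming that it contains the VERSION file and no edit logs. The following is a minimal sketch under that assumption; check_jn_archive is a hypothetical helper name, not an EMR or Hadoop command.

```shell
# Check that a JournalNode restore archive has the metadata and no edit logs.
# Returns 0 when the archive looks usable, 1 otherwise.
check_jn_archive() {
    archive=$1
    # List the archive members without extracting them.
    listing=$(tar -tzf "$archive") || return 1
    # The VERSION file must be present.
    echo "$listing" | grep -qx 'current/VERSION' || {
        echo "missing current/VERSION" >&2
        return 1
    }
    # No member whose file name starts with "edits" should be present.
    if echo "$listing" | grep -q '/edits[^/]*$'; then
        echo "archive still contains edit logs" >&2
        return 1
    fi
    echo "archive OK"
}

# Usage on the healthy node:
#   check_jn_archive /tmp/jn-current.tar.gz
```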

Step 3: Copy the package to the abnormal node

Still on the healthy node, switch to the emr-user to run the copy.

  1. Switch to the emr-user.

    Note

    If emr-user does not exist (for example, in EMR V3.41.0, EMR V5.7.0, or earlier minor versions), switch to the hadoop user instead. Run only one of the two commands.

    su emr-user

    If emr-user does not exist:

    su hadoop
  2. Copy the package to the abnormal node.

    scp /tmp/jn-current.tar.gz $unhealthy-journal-node:/tmp/

    Replace $unhealthy-journal-node with the hostname of the node where the abnormal JournalNode resides. It is a placeholder, not a shell variable, so do not run the command unchanged.
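
To confirm that the package arrived intact, you could compare checksums on the two nodes. This is a sketch that assumes sha256sum from GNU coreutils is available on both nodes; file_digest and verify_digest are hypothetical helper names.

```shell
# Print only the SHA-256 digest of a file.
file_digest() {
    sha256sum "$1" | awk '{print $1}'
}

# Return 0 when the digest of file $2 equals the expected digest $1.
verify_digest() {
    expected=$1
    actual=$(file_digest "$2")
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Usage sketch:
#   On the healthy node:
#     file_digest /tmp/jn-current.tar.gz
#   On the abnormal node, pasting the digest printed above:
#     verify_digest <digest-from-healthy-node> /tmp/jn-current.tar.gz && echo "copy intact"
```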

Step 4: Stop the abnormal JournalNode and restore its data

  1. In the EMR console, stop the abnormal JournalNode of the HDFS service.

  2. Log on to the abnormal node. For steps, see Log on to a cluster.

  3. Switch to the hdfs user.

    su hdfs
  4. Go to the JournalNode data directory and back up the existing data.

    Important

    Do not skip this backup. If the restore fails, you can recover from current.bak.

    cd /mnt/disk1/hdfs/journal/emr-cluster/
    mv current current.bak
  5. Extract the package to restore the JournalNode data.

    tar -xvf /tmp/jn-current.tar.gz
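
The backup and extract steps can also be wrapped in a guard that refuses to overwrite an earlier backup. The following is a minimal sketch, assuming it runs as the hdfs user and that the archive path is absolute; restore_jn_dir is a hypothetical helper name.

```shell
# Back up the existing current directory, then extract the archive in its place.
# Runs in a subshell so the caller's working directory is unchanged.
# Refuses to run if current.bak already exists, so a backup is never overwritten.
restore_jn_dir() (
    journal_dir=$1
    archive=$2          # must be an absolute path because of the cd below
    cd "$journal_dir" || exit 1
    if [ -e current.bak ]; then
        echo "current.bak already exists; refusing to overwrite it" >&2
        exit 1
    fi
    if [ -d current ]; then
        mv current current.bak || exit 1
    fi
    tar -xzf "$archive" || exit 1
    # Succeed only if the restored directory contains the VERSION file.
    [ -f current/VERSION ]
)

# Usage on the abnormal node:
#   restore_jn_dir /mnt/disk1/hdfs/journal/emr-cluster /tmp/jn-current.tar.gz
```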

Step 5: Start the restored JournalNode

In the EMR console, start the JournalNode that you restored.

After it starts, check the logs for errors. For more information, see HDFS service logs.
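
A quick way to scan a log file for problems is to filter by log level. The sketch below assumes the JournalNode log uses the common Hadoop layout in which the level appears as a space-delimited word such as ERROR or FATAL. The log file location is not given in this topic, so it is passed as an argument; show_log_errors is a hypothetical helper name.

```shell
# Print the last 20 lines of a log file that carry an ERROR or FATAL level.
# Prints nothing when no such lines exist.
show_log_errors() {
    grep -E ' (ERROR|FATAL) ' "$1" | tail -n 20
}

# Usage on the restored node:
#   show_log_errors <path-to-journalnode-log>
```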

Step 6: Verify the restoration

Open the HDFS web UI and check the JournalNode status. If data can be written to the JournalNode, the restoration is successful. For more information, see Web UIs of HDFS components.

Once you confirm the JournalNode is healthy, remove the backup directory.

    rm -rf /mnt/disk1/hdfs/journal/emr-cluster/current.bak