This topic describes how to troubleshoot the "DataXceiver Premature EOF from inputStream" error when you write data to Hadoop Distributed File System (HDFS).

Error message

DataXceiver error processing WRITE_BLOCK operation src: /10.*.*.*:35692 dst: /10.*.*.*:50010 java.io.IOException: Premature EOF from inputStream

Cause

Jobs open multiple HDFS write streams to continuously write new data to HDFS. However, the number of data streams that a DataNode can serve concurrently is capped by the dfs.datanode.max.transfer.threads parameter. If the cap is exceeded, the "DataXceiver Premature EOF from inputStream" error occurs.
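To check how close a DataNode is to the limit, you can query its JMX endpoint, which exposes an XceiverCount attribute for the current number of active transfer threads. The following is a minimal sketch; dn-host is a placeholder, and the HTTP port (50075 is the Hadoop 2.x DataNode default) depends on your deployment:

# Query the DataNode JMX servlet and extract the current xceiver count.
# dn-host is a placeholder; 50075 is the default DataNode HTTP port in Hadoop 2.x.
curl -s 'http://dn-host:50075/jmx?qry=Hadoop:service=DataNode,name=DataNodeInfo' \
  | grep -o '"XceiverCount" *: *[0-9]*'

If the reported count is close to the configured limit, the DataNode is about to start rejecting new write streams.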

Solution

View the logs of the DataNode. For more information, see HDFS service logs. In most cases, the logs contain the following error message:
java.io.IOException: Xceiver count 4097 exceeds the limit of concurrent xcievers: 4096
at org.apache.hadoop.hdfs.server.datanode.DataXceiverServer.run(DataXceiverServer.java:150)
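To check whether a specific DataNode hit the limit, you can also search its log file directly. The following is a minimal sketch; the log path is a placeholder that varies by deployment and version:

# Search the DataNode log for the xceiver limit error.
# The log path below is a placeholder; use the actual DataNode log location of your cluster.
grep -n 'exceeds the limit of concurrent xcievers' /path/to/hadoop-hdfs-datanode.log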
If this error message appears, perform the following steps to resolve the issue.
  1. Log on to the EMR on ECS console.
  2. In the top navigation bar, select the region where your cluster resides and select a resource group based on your business requirements.
  3. Go to the Configure tab of HDFS.
    1. On the EMR on ECS page, find the cluster that you want to manage and click Services in the Actions column.
    2. On the page that appears, click the Configure tab.
  4. Search for the dfs.datanode.max.transfer.threads parameter and increase its value as needed. In most cases, we recommend that you double the value. For example, set the parameter to 8192 or 16384. The equivalent hdfs-site.xml setting is shown after this procedure.
    Note The dfs.datanode.max.transfer.threads parameter specifies the maximum number of threads that can be used by the DataNode to process read and write data streams. Default value: 4096.
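For reference, if you maintain configuration files yourself instead of using the console, the same parameter is set in hdfs-site.xml on each DataNode. The following snippet is a minimal sketch with an illustrative value:

<!-- hdfs-site.xml: raise the cap on concurrent DataNode transfer threads. -->
<property>
  <name>dfs.datanode.max.transfer.threads</name>
  <!-- The default is 4096; 8192 doubles it. -->
  <value>8192</value>
</property>

In most Hadoop versions, the new value takes effect only after the DataNode process is restarted, so plan a restart or rolling restart of the DataNode component after you save the configuration.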