
E-MapReduce: FAQ

Last Updated: Mar 26, 2026

Fewer logs are written to Hive than generated

By default, the Flume Hadoop Distributed File System (HDFS) sink flushes events to HDFS after every 100 events (the hdfs.batchSize default). At high log volumes this batch size is too small: the sink falls behind the rate at which logs are generated, so Hive sees less data than was produced.

Increase hdfs.batchSize in the EMR console so that more events are written to HDFS per flush. For instructions, see Add parameters.
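A sink definition with a larger batch size might look like the following. This is a sketch only: the agent name a1, sink name k1, the HDFS path, and the value 10000 are placeholders, not recommendations.

```properties
# Hypothetical agent/sink names; only hdfs.batchSize is the setting at issue
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs:///user/flume/logs/%Y%m%d
# Write up to 10000 events to the file before each flush to HDFS
a1.sinks.k1.hdfs.batchSize = 10000
```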

DeadLock error when terminating the Flume process

When Flume shuts down, the exit method can deadlock while it waits for threads that are blocked on each other. If the process hangs on exit, run kill -9 to force-terminate it.
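The force-kill step can be sketched as below, assuming the agent runs as a JVM whose command line contains Flume's main class, org.apache.flume.node.Application; the helper name force_kill is hypothetical.

```shell
# SIGKILL cannot be caught, so it bypasses the shutdown hook that
# runs inside exit and in which the deadlock occurs.
force_kill() {
  # Default pattern matches the Flume agent's main class; pass
  # another pattern to target a different process.
  local pattern="${1:-org.apache.flume.node.Application}"
  local pid
  pid=$(pgrep -f "$pattern" | head -n 1)
  if [ -n "$pid" ]; then
    kill -9 "$pid"
  fi
}
```

Usage: `force_kill` (or `force_kill <pattern>` if the agent's command line differs).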

File Channel exceptions after force-killing the Flume process

The two exceptions below occur when kill -9 interrupts the File Channel without a clean shutdown.

Use kill -9 only when no other option is available. Force-killing the Flume process can leave lock files or corrupt data files.

Cannot lock data/checkpoints/xxx. The directory is already locked.

kill -9 leaves an in_use.lock file in the channel's checkpoint and data directories, which prevents Flume from acquiring the directory lock on restart. Delete the in_use.lock files before restarting Flume.
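The cleanup can be sketched as below; the helper name clean_locks and the example paths are hypothetical, and the real paths come from the channel's configured checkpointDir and dataDirs.

```shell
# Remove stale File Channel lock files left behind by kill -9.
clean_locks() {
  local dir
  for dir in "$@"; do
    # -f: do not fail if a directory has no leftover lock file
    rm -f "$dir/in_use.lock"
  done
}

# Example (placeholder paths):
#   clean_locks /mnt/flume/checkpoint /mnt/flume/data
```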

CorruptEventException: Could not parse event from data file.

kill -9 can corrupt File Channel data files mid-write. Delete the channel's checkpoint and data directories before restarting Flume. Note that this discards any events still buffered in the channel.
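Resetting the channel directories can be sketched as below; the helper name reset_channel and the example paths are hypothetical.

```shell
# Reset a corrupted File Channel by deleting and recreating its
# checkpoint and data directories.
# WARNING: this discards any events still buffered in the channel.
reset_channel() {
  local dir
  for dir in "$@"; do
    rm -rf "$dir"
    # Recreate the empty directory so Flume can reinitialize the channel
    mkdir -p "$dir"
  done
}

# Example (placeholder paths):
#   reset_channel /mnt/flume/checkpoint /mnt/flume/data
```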