
E-MapReduce:Kudu FAQ

Last Updated: Feb 27, 2026

This topic provides answers to some frequently asked questions about Kudu.

General

Where are Kudu log files stored?

Kudu log files are in /mnt/disk1/log/kudu.

What partitioning methods does Kudu support?

Kudu supports range partitioning and hash partitioning. The two methods can be combined. For details, see Apache Kudu Schema Design.

How do I access the Kudu web UI?

Kudu is not integrated with Knox. Create an SSH tunnel to access the web UI instead. For instructions, see Create an SSH tunnel to access web UIs of open source components.
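A minimal tunnel sketch, assuming the Kudu master's default web UI port of 8051 (tablet servers default to 8050); the key path and host below are placeholders for your own values:

```shell
# Forward local port 8051 to the Kudu master web UI through SSH.
# Replace the key path and master host with your cluster's values.
ssh -i ~/.ssh/your_key.pem -N -L 8051:localhost:8051 root@<master-node-address>
# Then open http://localhost:8051 in a local browser.
```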

Where can I find the Kudu community FAQ?

See the Apache Kudu Troubleshooting page.

Startup errors

NonRecoverableException when connecting with the Kudu client

The following error indicates a mismatch between the number of master nodes configured on the client and the number expected by the cluster:

org.apache.kudu.client.NonRecoverableException: Could not connect to a leader master. Client configured with 1 master(s) (192.168.0.10:7051) but cluster indicates it expects 3 master(s) (192.168.0.36:7051,192.168.0.11:7051,192.168.0.10:7051)

Deploy all required master nodes and configure the Kudu client with the addresses of all master nodes in the cluster, not just one of them.
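For example, using the three addresses from the error message above, the client's master list would look like this (the `kudu master list` check assumes the kudu CLI is installed):

```shell
# All three master RPC addresses from the error message, comma-separated.
KUDU_MASTERS="192.168.0.36:7051,192.168.0.11:7051,192.168.0.10:7051"
echo "$KUDU_MASTERS"
# With the kudu CLI available, verify membership and the current leader:
#   kudu master list "$KUDU_MASTERS"
```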

Kudu fails to start due to Bigboot monitor defect

A defect in Bigboot V3.5.0 prevents Kudu from restarting after a crash. The Bigboot monitor fails to delete the service information from its database, causing subsequent restart attempts to fail.

Stop Kudu and then start it again by running the following commands directly on the machine.

Run the following commands on a core or task node. On a master node, replace kudu-tserver in the commands with kudu-master.

/usr/lib/b2monitor-current/bin/monictrl -stop kudu-tserver
/usr/lib/b2monitor-current/bin/monictrl -start kudu-tserver
Note: Run these commands on the machine itself. The EMR console may not be able to perform the stop operation because the service is already terminated.

Clock synchronization error prevents Kudu from starting

This error appears when ntpd on the machine cannot connect to the configured NTP server:

Service unavailable: RunTabletServer() failed: Cannot initialize clock: timed out waiting for clock synchronisation: Error reading clock. Clock considered unsynchronized

The logs may also include output similar to:

E1010 10:37:54.165313 29920 system_ntp.cc:104] /sbin/ntptime
------------------------------------------
stdout:
ntp_gettime() returns code 5 (ERROR)
  time e6ee0402.2a452c4c  Mon, Oct 10 2022 10:37:54.165, (.165118697),
  maximum error 16000000 us, estimated error 16000000 us, TAI offset 0
ntp_adjtime() returns code 5 (ERROR)
  modes 0x0 (),
  offset 0.000 us, frequency 187.830 ppm, interval 1 s,
  maximum error 16000000 us, estimated error 16000000 us,
  status 0x2041 (PLL,UNSYNC,NANO),
  time constant 6, precision 0.001 us, tolerance 500 ppm,

Restart the ntpd service so that it can synchronize with the configured NTP server, and then start Kudu again.
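A quick health check on the affected node before restarting Kudu, assuming ntpd and its tools are installed (on chrony-based images, use chronyc instead):

```shell
# ntpstat exits non-zero while the clock is unsynchronized.
ntpstat
# List configured peers; the current sync peer is marked with '*'.
ntpq -p
# If no server is reachable, fix connectivity or the server list, then:
systemctl restart ntpd
```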

Runtime errors

Network error: unable to resolve hostname

Bad status: Network error: Could not obtain a remote proxy to the peer.: unable to resolve address for <hostname>: Name or service not known

This error occurs when a hostname cannot be resolved to an IP address. A Raft peer hosting a Kudu tablet cannot identify its fellow Raft servers, so it terminates the network connection.

Solution 1: Manually add the hostname-to-IP mapping in /etc/hosts.

Solution 2: If the host behind the hostname has been released, add a mapping between the hostname and any IP address to /etc/hosts. The IP address does not need to be reachable. Once the mapping is in place, the Kudu tablet server replicates data from the unavailable Raft server to a new Raft server in the Raft group.
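A sketch of Solution 2. By default it writes to a stand-in file; set HOSTS_FILE=/etc/hosts on the real node. The hostname and IP below are hypothetical; use the hostname from your error message:

```shell
# Append a placeholder mapping for a hostname that no longer resolves.
HOSTS_FILE="${HOSTS_FILE:-/tmp/hosts.example}"   # use /etc/hosts on the node
UNRESOLVED_HOST="emr-worker-3.cluster"           # hostname from the error message
PLACEHOLDER_IP="192.168.0.99"                    # any address; reachability not required
grep -q "$UNRESOLVED_HOST" "$HOSTS_FILE" 2>/dev/null || \
  echo "$PLACEHOLDER_IP $UNRESOLVED_HOST" >> "$HOSTS_FILE"
grep "$UNRESOLVED_HOST" "$HOSTS_FILE"
```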

Filesystem layout integrity error

Bad status: I/O error: Failed to load Fs layout: could not verify integrity of files: <directory>, <number> data directories provided, but expected <number>

The number of disks specified by -fs_data_dirs does not match the metadata recorded by -fs_metadata_dir. Update -fs_data_dirs so the disk count matches what is recorded in -fs_metadata_dir.
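For illustration, a matching pair of flags might look like the following (the paths are hypothetical; list exactly the directories the metadata expects):

```shell
# kudu-tserver / kudu-master startup flags (example paths).
--fs_metadata_dir=/mnt/disk1/kudu/master
--fs_data_dirs=/mnt/disk1/kudu/data,/mnt/disk2/kudu/data
```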

Thread creation failure (pthread_create error 11)

pthread_create failed: Resource temporarily unavailable (error 11)

Check the following causes in order.

Insufficient process limits

Check the current limit for max user processes:

ulimit -a

If the value is too low, increase it by modifying /etc/security/limits.conf or by creating /etc/security/limits.d/kudu.conf.
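For example, check the current limit and, if needed, raise it with a drop-in file (the values below are illustrative, not recommendations):

```shell
# Show the per-user process/thread limit for the current shell.
ulimit -u
# Persistent fix: add lines like these to /etc/security/limits.d/kudu.conf,
# then log in again for them to take effect:
#   kudu  soft  nproc  32768
#   kudu  hard  nproc  65536
```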

Kudu client V0.8 thread leak in hybrid deployments

In hybrid deployments, Spark executors may leak threads when using Kudu client V0.8. This is a known issue documented in KUDU-1453. Upgrade to Kudu client V0.9 to resolve the issue.

Trino shutdown thread leak

When Trino exits, the shutdown hook thread blocks in the take method of BlockingQueue, waiting for an element, and cannot be interrupted. The EMR controller keeps sending SIGTERM, and each signal spawns a new SIGTERM Handler thread until the process runs out of threads.

Fix the issue on the Trino side, or terminate the process with kill -9.
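A sketch of the forced termination; the pgrep pattern is an assumption and may need adjusting to match your Trino launcher:

```shell
# Force-kill a Trino server that ignores SIGTERM.
TRINO_PID=$(pgrep -f 'io.trino.server.TrinoServer' || true)
if [ -n "$TRINO_PID" ]; then
  kill -9 $TRINO_PID
  echo "sent SIGKILL to $TRINO_PID"
else
  echo "no Trino process found"
fi
```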

Jindo SDK thread pool leak

Spark uses the JindoOssCommitter class for write jobs. This class creates a JindoOssMagicCommitter object that generates a thread pool named oss-committer-pool. The thread pool is not static and is never shut down. As new JindoOssMagicCommitter objects are created, thread pools accumulate without being released. This is especially likely with Spark Streaming or Structured Streaming workloads.

Add the following Spark parameters to work around the issue:

spark.sql.hive.outputCommitterClass=org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
spark.sql.sources.outputCommitterClass=org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter

Find the process with the most threads

Use the following threads_monitor.sh script to identify which process consumes the most threads:

#!/bin/bash
# Count the threads of every process and report the process with the most.

total_threads=0
max_pid=-1
max_threads=-1

for pid in $(ls /proc | grep -E '^[0-9]+$'); do
  if [[ -f /proc/$pid/status ]]; then
    num_threads=$(awk '/^Threads/ {print $NF}' /proc/$pid/status)
    ((total_threads+=num_threads))
    if [[ ${max_threads} -lt ${num_threads} ]]; then
      max_pid=${pid}
      max_threads=${num_threads}
    fi
    # echo "Threads of ${pid}: ${num_threads}"
  fi
done

echo "Total threads: ${total_threads}"
echo "Max threads: ${max_threads}, pid is ${max_pid}"
ps -fp "${max_pid}"
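On Linux, ps can report the same information directly through the NLWP (number of lightweight processes) column, which is a quick alternative to the script:

```shell
# Top 5 processes by thread count, highest first.
ps -eo nlwp,pid,comm --sort=-nlwp | head -n 6
```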

Soft memory limit exceeded

Rejecting Write request: Soft memory limit exceeded

The volume of data being written exceeds the soft memory limit. You can perform the following operations:

  1. Configure the memory_limit_hard_bytes parameter to raise the hard memory limit. The default value is 0, which lets the system set the maximum memory usage automatically. A value of -1 removes the limit entirely.

  2. Configure the memory_limit_soft_percentage parameter to adjust the percentage of the hard limit at which Kudu starts rejecting writes. The default value is 80.
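For example, the corresponding tablet server flags might look like this (the values are illustrative, not recommendations):

```shell
# kudu-tserver startup flags (example values).
--memory_limit_hard_bytes=0         # 0 = auto-size from system memory; -1 = unlimited
--memory_limit_soft_percentage=85   # default is 80
```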