Use this guide to diagnose and resolve common issues with the Hive service on E-MapReduce (EMR), including problems with HiveMetaStore and HiveServer2.
Locate exceptions
When a performance or connectivity issue occurs on the Hive client, work through the following steps to identify the root cause.
Step 1: Check cluster resources.
Inspect CPU, memory, network, and disk usage on the cluster where Hive is deployed.
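As a rough first pass on a single node, the check above can be sketched with the Python standard library. This is a minimal sketch, not an EMR tool: the function name `resource_snapshot` is hypothetical, it covers only CPU load and disk usage, and it assumes a Unix-like host. Memory and network still need to be checked with the cluster's monitoring.

```python
import os
import shutil


def resource_snapshot(path="/"):
    """Return load averages and disk usage for a quick first look at a node."""
    # os.getloadavg() is available on Unix-like systems only.
    load1, load5, load15 = os.getloadavg()
    disk = shutil.disk_usage(path)
    return {
        "cpu_count": os.cpu_count(),
        "load_1m": load1,
        "load_5m": load5,
        "load_15m": load15,
        "disk_used_pct": round(100 * disk.used / disk.total, 1),
    }
```

A 1-minute load average that stays well above `cpu_count`, or a nearly full disk, points to resource pressure on the node rather than a Hive-specific fault.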
Step 2: Check component health.
-
Check whether an exception has occurred in the HiveMetaStore or HiveServer2 component. If a garbage collection (GC)-related metric shows high memory usage, adjust the memory size. For more information, see Modify the memory parameters of the Hive service.
-
On the Monitoring tab of the cluster, open the Metric Monitoring tab and review key metrics for HiveMetaStore and HiveServer2 to determine whether parameter changes are needed. For more information, see Check items and key metrics of Hive.
-
Check the component logs. In most cases, logs for each component are stored at the following paths:
Component       Log path
HiveMetaStore   /mnt/disk1/log/hive/
HiveServer2     /mnt/disk1/log/hive/
For each component, review these log types:
Log type   Description
.log       Main application log
.err       Standard error output
.out       Standard output
GC logs    Garbage collection activity
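The log review can be partly automated. The following is a minimal sketch that scans `.log` files under `/mnt/disk1/log/hive/` for error strings quoted in this guide. The function names and the `*.log` file-name pattern are assumptions for illustration. Adjust the pattern list to the errors you are chasing.

```python
import glob
import os

# Error fragments taken from the messages discussed in this guide.
PATTERNS = (
    "java.lang.OutOfMemoryError",
    "Could not connect to meta store",
    "Communications link failure",
)


def find_matches(lines, patterns=PATTERNS):
    """Return (line_number, text) pairs for lines containing any pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(pattern in line for pattern in patterns):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_log_dir(log_dir="/mnt/disk1/log/hive/"):
    """Scan every *.log file in log_dir (file-name pattern is an assumption)."""
    results = {}
    for path in sorted(glob.glob(os.path.join(log_dir, "*.log"))):
        with open(path, errors="replace") as handle:
            hits = find_matches(handle)
        if hits:
            results[path] = hits
    return results
```

Calling `scan_log_dir()` on the node that runs the component returns a dictionary that maps each log file to its matching lines. An empty dictionary means none of the listed patterns were found.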
Issues related to HiveMetaStore databases
Issue 1: Host xxxx is blocked because of many connection errors; unblock with 'mysqladmin flush-hosts'
Cause: Repeated connection failures caused the database to block the client.
Solution 1: Increase max_connect_errors
max_connect_errors protects against brute-force password attacks. Do not set it to an excessively large value.
The change takes effect immediately without a restart.
-
Log in to the database and check the current value:
show global variables like '%max_connect_errors%'
-
Set the parameter to a larger value:
set global max_connect_errors=[A larger value]
Solution 2: Flush the blocked hosts
Run the following command to clear the cached connection error data for the blocked hosts. Alternatively, log in to the database and run the FLUSH HOSTS statement.
mysqladmin -u root -p flush-hosts
Issue 2: Metastore Connection Driver : com.mysql.jdbc.Driver Metastore connection User: xxx ,stderr=org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version. Underlying cause: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException : Communications link failure
Cause: The Hive metastore database has not been initialized on the self-managed ApsaraDB RDS for MySQL instance.
Solution: Initialize the Hive metastore database. For more information, see the Initialize the metastore service section in the Configure an independent ApsaraDB RDS for MySQL database topic.
HiveMetaStore-related issues
Issue: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Could not connect to meta store using any of the URIs provided
Cause 1: HiveMetaStore is overloaded or interrupted.
Long GC pauses are a common indicator.
Solutions:
-
Check the GC logs of HiveMetaStore. If GC activity is high, increase the memory size. For more information, see Modify the memory parameters of the Hive service.
-
Check the HiveMetaStore logs. If java.lang.OutOfMemoryError appears, increase the memory size. For more information, see Modify the memory parameters of the Hive service.
-
Check whether hive.metastore.transactional.event.listeners or hive.metastore.event.db.listener is configured. Some listeners cause rapid memory growth in HiveMetaStore. If listeners are configured, remove their settings and restart HiveMetaStore.
-
If a large number of client or concurrent requests are sent to HiveMetaStore, increase its memory size. The default memory size is 500 MiB. For more information, see Modify the memory parameters of the Hive service.
-
If HiveMetaStore fails to start, check the error messages in its logs and verify the database connection configuration.
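The listener check in the list above can be scripted. The following is a minimal sketch, assuming the properties are stored in a Hadoop-style `hive-site.xml` file (`<configuration>` containing `<property>` elements with `<name>` and `<value>` children). The function name and the sample document are illustrative only.

```python
import xml.etree.ElementTree as ET

# Property names as given in this guide.
LISTENER_KEYS = (
    "hive.metastore.transactional.event.listeners",
    "hive.metastore.event.db.listener",
)

# Illustrative sample only; a real hive-site.xml holds many more properties.
SAMPLE = """<configuration>
  <property>
    <name>hive.metastore.transactional.event.listeners</name>
    <value>org.apache.hive.hcatalog.listener.DbNotificationListener</value>
  </property>
</configuration>"""


def configured_listeners(xml_text, keys=LISTENER_KEYS):
    """Return {name: value} for listener properties that have a non-empty value."""
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name in keys and value:
            found[name] = value
    return found
```

A non-empty result from `configured_listeners(SAMPLE)` means a listener is set and is a candidate for removal as described above. Properties with an empty value are treated as not configured.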
Cause 2: The client is not connected to HiveMetaStore.
This is common when the client runs on a self-managed Elastic Compute Service (ECS) instance.
Solution: Verify that the client can reach HiveMetaStore. If not, establish the connection.
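One way to verify reachability from the client machine is a plain TCP connection test. The following is a minimal sketch: the function name is hypothetical, and a successful TCP connection only shows that the port is reachable, not that HiveMetaStore is healthy.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `can_connect("metastore-host.example", 9083)` tests the default Hive metastore port, 9083. The host name here is a placeholder, and your cluster may use a different port.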
HiveServer2-related issues
Issue 1: HiveServer2 unexpectedly restarts and recovers after a period of time
Cause: An exception in HiveServer2 triggered the restart. Check whether the number of executed SQL statements has increased or whether the service workload is heavy.
Solution: Check resource usage on the machine running HiveServer2. Follow the steps in the Locate exceptions section to identify the root cause. If workloads are heavy, increase the HiveServer2 memory size.
Issue 2: Unexpected end of file when reading from HS2 server. The root cause might be too many concurrent connections
Cause: HiveServer2 is overloaded due to too many concurrent connections.
Solution:
-
Check whether an application (for example, a Flink job) is continuously calling the Hive CLI. If so, stop the application.
-
If no abnormal application is found, increase the HiveServer2 memory size and raise the value of hive.server2.thrift.max.worker.threads. For more information about adjusting memory, see Modify the memory parameters of the Hive service.
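To judge whether concurrent connections are near the worker-thread limit, you can count established TCP connections on the HiveServer2 port. The following is a minimal sketch for Linux that parses the `/proc/net/tcp` and `/proc/net/tcp6` tables. The function names are hypothetical, and port 10000 is assumed from the error message in Issue 3.

```python
ESTABLISHED = "01"  # TCP state code used in /proc/net/tcp and /proc/net/tcp6


def count_established(proc_net_text, port=10000):
    """Count ESTABLISHED sockets whose local port equals `port`."""
    count = 0
    for line in proc_net_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        # fields[1] is "HEXADDR:HEXPORT"; fields[3] is the state code.
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        if local_port == port and fields[3] == ESTABLISHED:
            count += 1
    return count


def count_on_this_host(port=10000):
    """Sum the IPv4 and IPv6 tables; a missing table counts as zero."""
    total = 0
    for table in ("/proc/net/tcp", "/proc/net/tcp6"):
        try:
            with open(table) as handle:
                total += count_established(handle.read(), port)
        except OSError:
            pass
    return total
```

Run `count_on_this_host()` on the HiveServer2 node and compare the result with the configured value of `hive.server2.thrift.max.worker.threads`. A count that stays close to that value suggests the thread pool is saturated.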
Issue 3: Could not connect to any of [xxx, 10000]
Cause: An exception has occurred in HiveServer2.
Solution: Check resource usage on the machine running HiveServer2. Follow the steps in the Locate exceptions section to identify and resolve the issue.
Issue 4: java.lang.OutOfMemoryError: Compressed class space
Cause: The compressed class space allocated to HiveServer2 is insufficient.
Solution: Increase the compressed class space for HiveServer2.
-
In the EMR console, go to the Configure tab of the Hive service page.
-
Click the hive-env.sh tab.
-
In the hive_server2_opts configuration item, increase the value of -XX:CompressedClassSpaceSize. Keep the default values for all other parameters in the configuration item. For example, set -XX:CompressedClassSpaceSize to 512m.
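The edit to the options string can be illustrated as follows. The JVM option is written with a leading hyphen, as `-XX:CompressedClassSpaceSize=<size>`. This is a minimal sketch: the function name is hypothetical, and the existing option strings in the usage note are made-up samples, not the actual default value of `hive_server2_opts`.

```python
import re

FLAG = "-XX:CompressedClassSpaceSize"


def set_compressed_class_space(opts, size="512m"):
    """Return opts with the flag set to size, leaving other options unchanged."""
    pattern = re.compile(re.escape(FLAG) + r"=\S+")
    replacement = f"{FLAG}={size}"
    if pattern.search(opts):
        # The flag is already present: replace only its value.
        return pattern.sub(replacement, opts)
    # The flag is absent: append it to the existing options.
    return f"{opts} {replacement}".strip()
```

For a sample string such as `-Xmx2g -XX:CompressedClassSpaceSize=256m`, the function changes only the flag's value to `512m`. If the flag is missing, it is appended and the other options stay as they are.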