This topic describes how to troubleshoot issues related to the Hive service.

Locate exceptions

If an exception such as a performance exception occurs on the Hive client, you can perform the following steps to locate the exception:
  • Check the CPU, memory, network, and disks of the cluster in which the Hive service is deployed.
  • Check whether the components of the Hive service run as expected by performing the following steps:
    1. Check whether an exception occurs in the HiveMetaStore and HiveServer2 components of Hive. If an exception occurs, troubleshoot the issue based on the corresponding metrics. For example, if GC-related metrics indicate that the memory usage is too high, you must adjust the memory size. For more information, see Modify the memory parameters of the Hive service.
    2. Go to the Monitoring tab of the cluster, and click the Metric Monitoring tab. On the Metric Monitoring tab, check the key metrics of the HiveMetaStore and HiveServer2 components and determine whether you need to modify the related parameters. For more information, see Check items and key metrics of Hive.
    3. Check the logs of the HiveMetaStore or HiveServer2 component. In most cases, the logs of the components are stored in the /mnt/disk1/log/hive/ directory. You can check the .log, .err, .out, and GC logs of the HiveMetaStore or HiveServer2 component to identify the cause of the exception that occurs in the component.
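
The log check in step 3 can be sketched as a small shell function. This is a minimal sketch, not an official tool: the function name scan_hive_logs and the search pattern are assumptions made for illustration, and the default directory is the one named above. Adjust the file name patterns if your cluster rotates logs under different names.

```shell
#!/usr/bin/env bash
# Sketch: count common failure markers in Hive component logs.
# HIVE_LOG_DIR defaults to the directory named in this topic; override it
# if your cluster stores logs elsewhere.
HIVE_LOG_DIR="${HIVE_LOG_DIR:-/mnt/disk1/log/hive}"

# scan_hive_logs DIR   (hypothetical helper)
# Prints "<file>:<count>" for every .log, .err, or .out file under DIR
# that contains at least one line matching the failure pattern.
scan_hive_logs() {
  local dir="$1"
  # Assumed pattern; extend it with the messages you are looking for.
  local pattern='OutOfMemoryError|Exception|ERROR'
  find "$dir" -type f \( -name '*.log' -o -name '*.err' -o -name '*.out' \) -print0 2>/dev/null |
    while IFS= read -r -d '' file; do
      local count
      # grep -c prints 0 and exits non-zero when nothing matches.
      count=$(grep -cE "$pattern" "$file" || true)
      if [ "$count" -gt 0 ]; then
        printf '%s:%s\n' "$file" "$count"
      fi
    done
}

# Example invocation (commented out so that sourcing this file has no side effects):
# scan_hive_logs "$HIVE_LOG_DIR"
```

Files that report a high count are a reasonable place to start reading. The count is only a hint and does not replace checking the GC logs and metrics.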

Issues related to HiveMetaStore databases

Issue 1: The error "Host xxxx is blocked because of many connection errors; unblock with 'mysqladmin flush-hosts'" occurs

Cause: Errors frequently occur when the client connects to the database. As a result, the database rejects the connection requests that are sent from the client.

Solutions:
  • Solution 1: Increase the value of the max_connect_errors parameter. The modification immediately takes effect.
    Important: The max_connect_errors parameter is used to prevent a client from brute-force cracking the database password. We recommend that you do not set this parameter to an excessively large value.
    1. Log on to the database and run the following command to view the value of the max_connect_errors parameter:
      show global variables like '%max_connect_errors%';
    2. Run the following command to set the max_connect_errors parameter to a larger value:
      set global max_connect_errors=[A larger value];
  • Solution 2: Run the following command to clear the host cache, which unblocks the hosts that were blocked because of connection errors. You can also log on to the database and run the flush hosts statement to clear the host cache.
    mysqladmin -u root -p flush-hosts
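
The two solutions above can be combined into a short shell sketch. The helper name build_set_max_connect_errors and the value 1000 are assumptions made for illustration. The helper only validates the value and prints the statement, so you can review the statement before you run it against the database.

```shell
#!/usr/bin/env bash
# build_set_max_connect_errors VALUE   (hypothetical helper)
# Prints the SQL statement that sets the new limit, or fails if VALUE is
# not a positive integer.
build_set_max_connect_errors() {
  local value="$1"
  if ! [[ "$value" =~ ^[1-9][0-9]*$ ]]; then
    echo "max_connect_errors must be a positive integer, got: '$value'" >&2
    return 1
  fi
  printf 'SET GLOBAL max_connect_errors=%s;\n' "$value"
}

# Example usage against a live database (commented out; requires the mysql
# client and valid credentials; 1000 is an illustrative value only):
# mysql -u root -p -e "SHOW GLOBAL VARIABLES LIKE 'max_connect_errors';"
# mysql -u root -p -e "$(build_set_max_connect_errors 1000)"
# mysql -u root -p -e "FLUSH HOSTS;"
```

Validating the value first keeps a typo from being sent to the database as part of a statement that runs with administrator privileges.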

Issue 2: The error "Metastore Connection Driver : com.mysql.jdbc.Driver Metastore connection User: xxx ,stderr=org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version. Underlying cause: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException : Communications link failure" occurs

Cause: The Hive metastore database is not initialized on the independent ApsaraDB RDS for MySQL database.

Solution: Initialize the Hive metastore database. For more information, see the Initialize the metastore service section in the Configure an independent ApsaraDB RDS for MySQL database topic.

HiveMetaStore-related issues

Issue: The error "org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Could not connect to meta store using any of the URIs provided" occurs

  • Cause 1: The HiveMetaStore component has stopped or is heavily loaded. For example, garbage collection (GC) pauses last for a long period of time.
    Solutions:
    • Check the GC logs of the HiveMetaStore component and increase the memory size of the HiveMetaStore component. For more information, see Modify the memory parameters of the Hive service.
    • Check the logs of the HiveMetaStore component. If java.lang.OutOfMemoryError appears in the logs, increase the memory size of the HiveMetaStore component. For more information, see Modify the memory parameters of the Hive service.
    • Check whether the hive.metastore.transactional.event.listeners and hive.metastore.event.db.listener parameters are configured. Some listeners may cause a rapid increase in the memory usage of the HiveMetaStore component. If listeners are configured, remove the settings of the listeners and restart the HiveMetaStore component.
    • If a large number of client requests or concurrent requests are sent to the HiveMetaStore component, increase the memory size of the HiveMetaStore component. The default memory size is 500 MiB. For more information, see Modify the memory parameters of the Hive service.
    • Check the error message in the logs of the HiveMetaStore component. If the HiveMetaStore component cannot be started, check whether the configurations of the database link are correct.
  • Cause 2: The client cannot reach the HiveMetaStore component over the network. This is a common cause of the issue for a client that is deployed on a self-managed Elastic Compute Service (ECS) instance.

    Solution: Check whether the client can connect to the HiveMetaStore component. If the client cannot connect to the HiveMetaStore component, establish network connectivity between the client and the HiveMetaStore component.
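
The connectivity check for Cause 2 can be sketched in shell as follows. This is a minimal sketch under stated assumptions: the function name port_open and the host name are hypothetical, and 9083 is the default HiveMetaStore Thrift port. Use the host and port from the hive.metastore.uris setting of your client instead.

```shell
#!/usr/bin/env bash
# port_open HOST PORT [TIMEOUT_SECONDS]   (hypothetical helper)
# Returns 0 if a TCP connection to HOST:PORT can be opened within the
# timeout (default: 3 seconds). Uses the /dev/tcp redirection built into
# bash and the coreutils timeout command.
port_open() {
  local host="$1" port="$2" secs="${3:-3}"
  # In the child shell, $0 is the host and $1 is the port.
  timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example (commented out; metastore-host.example is a placeholder):
# if port_open metastore-host.example 9083; then
#   echo "HiveMetaStore port is reachable"
# else
#   echo "HiveMetaStore port is not reachable; check routes and security group rules"
# fi
```

A successful TCP connection only shows that the port is reachable. It does not prove that the HiveMetaStore component is healthy, so the checks for Cause 1 still apply.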

HiveServer2-related issues

Issue 1: HiveServer2 unexpectedly restarts and runs as expected after a period of time

Cause: An exception may have occurred in HiveServer2, for example, because the number of executed SQL statements increased or the service workloads became heavy.

Solution: Check the environment of the machine on which the HiveServer2 component is deployed and check the component by following the instructions that are described in Locate exceptions. If the service workloads are heavy, adjust the memory size of the HiveServer2 component.

Issue 2: The error "Unexpected end of file when reading from HS2 server. The root cause might be too many concurrent connections" occurs

Cause: HiveServer2 is overloaded.

Solution: Check whether an application, such as a Flink job, continuously calls the Hive CLI. If the application continuously calls the Hive CLI, stop the application. If no abnormal application is found, adjust the memory size and modify the hive.server2.thrift.max.worker.threads parameter. For more information about how to adjust the memory size, see Modify the memory parameters of the Hive service.
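
To judge whether the number of concurrent connections is the problem, you can count the established connections on the HiveServer2 port. The following is a minimal sketch: the function name count_established is hypothetical, and 10000 is the default HiveServer2 Thrift port, so replace it if your cluster uses a different port.

```shell
#!/usr/bin/env bash
# count_established PORT   (hypothetical helper)
# Reads the output of `ss -tn` on stdin and prints the number of
# established connections whose local port is PORT.
count_established() {
  local port="$1"
  # Column 1 is the socket state, column 4 is the local address and port.
  awk -v port="$port" '$1 == "ESTAB" && $4 ~ (":" port "$") { n++ } END { print n + 0 }'
}

# Example on the HiveServer2 host (commented out):
# ss -tn | count_established 10000
```

If the count stays close to the value of hive.server2.thrift.max.worker.threads, look for the application that opens the connections before you raise the limit.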

Issue 3: The error "Could not connect to any of [xxx, 10000]" occurs

Cause: An exception may exist in HiveServer2.

Solution: Check the environment of the machine on which the HiveServer2 component is deployed or check the component by following the instructions that are described in Locate exceptions.

Issue 4: The error "java.lang.OutOfMemoryError: Compressed class space" occurs

Cause: The compressed class space of the HiveServer2 component is insufficient.

Solution: Increase the compressed class space of the HiveServer2 component by performing the following operations: In the E-MapReduce (EMR) console, go to the Configure tab of the Hive service. On the Configure tab, click the hive-env.sh tab and increase the value of the -XX:CompressedClassSpaceSize option in the hive_server2_opts configuration item. Retain the default values for other parameters in the configuration item. For example, you can set the option to -XX:CompressedClassSpaceSize=512m.
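
The change to the option string can be illustrated with a small shell sketch. You make the actual change in the EMR console. The function name set_ccs_size and the sample options such as -Xmx2g are assumptions used only to show what the resulting value of hive_server2_opts might look like.

```shell
#!/usr/bin/env bash
# set_ccs_size OPTS SIZE   (hypothetical helper)
# Prints OPTS with -XX:CompressedClassSpaceSize set to SIZE. An existing
# setting is replaced; otherwise the flag is appended. All other options
# are kept unchanged.
set_ccs_size() {
  local opts="$1" size="$2"
  local flag="-XX:CompressedClassSpaceSize"
  if [[ "$opts" == *"$flag="* ]]; then
    printf '%s\n' "$opts" | sed -E "s/${flag}=[^[:space:]]+/${flag}=${size}/"
  else
    # ${opts:+$opts } adds a separating space only when OPTS is not empty.
    printf '%s\n' "${opts:+$opts }${flag}=${size}"
  fi
}

# Example with illustrative options (commented out):
# set_ccs_size '-Xmx2g -XX:CompressedClassSpaceSize=256m' 512m
```

Replacing the existing flag instead of appending a second one keeps the option string unambiguous when you review it later.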