Use Hive to access HBase

Last Updated: Mar 30, 2021

ApsaraDB for HBase Performance-enhanced Edition allows you to connect to the database service by using Hive. However, Hive does not call ApsaraDB for HBase through standard operations such as GET or PUT. Instead, it calls internal HBase classes. For this reason, adding the alihbase-connector JAR file to the hive/lib directory is not enough to make Hive compatible. You must replace the existing HBase JAR files in the hive/lib directory instead.

  1. Delete all JAR files whose names start with hbase from the hive/lib directory (highlighted in red in the figure). Do not delete hive-hbase-handler-{version}.jar, which contains the code that Hive uses to access HBase.

  2. Download the JAR package of the alihbase-compatible client. Copy all JAR files whose names start with alihbase- to the hive/lib directory. The other dependencies in the package are optional: you can copy them or leave them out.

  3. If the HBase dependency is specified in the hive/.hiverc file or by the --auxpath parameter when Hive starts, replace the loaded JAR files with the new JAR files whose names start with alihbase.
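
The JAR replacement described in the steps above can be sketched as a small shell function. This is a minimal sketch under stated assumptions, not an official script: the function name replace_hbase_jars, both directory arguments, and the backup directory are hypothetical. Instead of deleting the old hbase JAR files, it moves them to a backup directory so that the change can be reverted.

```shell
#!/bin/sh
# Sketch: swap the stock HBase client JARs in hive/lib for the alihbase-* JARs.
# Hypothetical arguments:
#   $1  path to the hive/lib directory
#   $2  directory that holds the extracted alihbase client package
replace_hbase_jars() {
    hive_lib=$1
    alihbase_dir=$2
    backup_dir="$hive_lib/../hbase-jars-backup"   # assumed backup location

    mkdir -p "$backup_dir" || return 1

    # Move every JAR whose name starts with "hbase" out of hive/lib.
    # hive-hbase-handler-{version}.jar starts with "hive", so this glob skips it.
    for jar in "$hive_lib"/hbase*.jar; do
        [ -e "$jar" ] || continue        # the glob matched nothing
        mv "$jar" "$backup_dir"/ || return 1
    done

    # Copy the alihbase-* JARs into hive/lib.
    for jar in "$alihbase_dir"/alihbase-*.jar; do
        [ -e "$jar" ] || continue
        cp "$jar" "$hive_lib"/ || return 1
    done
}
```

A call might look like replace_hbase_jars /opt/hive/lib /tmp/alihbase-client, where both paths are placeholders for your own installation. Restart Hive afterward so that it loads the new JAR files.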

Retrieve an endpoint

For more information, see Use the Java API to access an enhanced edition cluster.

Retrieve the username and password

The default username and password are both root. If you disable the ACL feature on the cluster management page, you do not need to provide a username or password.

Add the IP addresses of the Hive servers to the ApsaraDB for HBase whitelist

The IP addresses of all Hive machines that access HBase must be added to the whitelist of the HBase cluster. Otherwise, these machines cannot access the cluster.

Configure connection parameters in Hive

You can configure the HBase connection parameters in Hive in two ways. The first is to add the following configurations to the hive-site.xml file:

<configuration>
    <!-- The endpoint of your ApsaraDB for HBase cluster. You can retrieve the cluster endpoint from the Database Connection page of the ApsaraDB for HBase console. Different cluster endpoints are used to access ApsaraDB for HBase over a VPC network and a public network. -->
    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>ld-xxxx-proxy-hbaseue.hbaseue.xxx.rds.aliyuncs.com:30020</value>
    </property>
    <!-- Specify the username and password. The default username and password are both root. -->
    <property>
        <name>hbase.client.username</name>
        <value>root</value>
    </property>
    <property>
        <name>hbase.client.password</name>
        <value>root</value>
    </property>
</configuration>

You can also specify the parameters by running the following commands on the Hive client:

set hbase.zookeeper.quorum=ld-xxxx-proxy-hbaseue.hbaseue.xxx.rds.aliyuncs.com:30020;
set hbase.client.username=root;
set hbase.client.password=root;
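
If you do not want to type these set commands in every session, one option is to keep them in an initialization file and pass that file to the Hive CLI with the -i option. The following is a minimal sketch: the function name write_hbase_init and the file path are hypothetical, and the endpoint shown is the placeholder used in this topic, which you must replace with the endpoint from your console.

```shell
#!/bin/sh
# Sketch: write the three connection settings to an init file for `hive -i`.
# Hypothetical arguments: $1 output file, $2 endpoint, $3 username, $4 password.
# Username and password fall back to root, the default stated in this topic.
write_hbase_init() {
    out=$1
    endpoint=$2
    user=${3:-root}
    pass=${4:-root}
    {
        printf 'set hbase.zookeeper.quorum=%s;\n' "$endpoint"
        printf 'set hbase.client.username=%s;\n' "$user"
        printf 'set hbase.client.password=%s;\n' "$pass"
    } > "$out"
    chmod 600 "$out"    # the file contains a password
}

# Example with the placeholder endpoint from this topic:
# write_hbase_init "$HOME/hbase-init.hql" \
#     "ld-xxxx-proxy-hbaseue.hbaseue.xxx.rds.aliyuncs.com:30020"
# hive -i "$HOME/hbase-init.hql"
```

Settings applied this way last only for the session that loads the file, which is the same scope as typing the set commands by hand.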

How to use Hive

If the ApsaraDB for HBase table that you want to manage does not exist, create the table in Hive. Hive creates both a Hive table and an ApsaraDB for HBase table and automatically associates them with each other.

  • Launch the Hive CLI.
  • Create an ApsaraDB for HBase table.

    CREATE TABLE hive_hbase_table(key int, value string)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
    TBLPROPERTIES ("hbase.table.name" = "hive_hbase_table", "hbase.mapred.output.outputtable" = "hive_hbase_table");
  • Insert data into the ApsaraDB for HBase table by using the Hive CLI.

    insert into hive_hbase_table values(212,'bab');

  • View the table in ApsaraDB for HBase. The table has been created and the data has been inserted.

  • Write data to the ApsaraDB for HBase table and check the data in Hive.

  • Query the data in Hive:

  • After you delete the table in Hive, the associated table in ApsaraDB for HBase is also deleted.

  • Query the table in ApsaraDB for HBase. An error message is returned indicating that the table does not exist.

If the ApsaraDB for HBase table already exists, you can associate it with an external table in Hive. Deleting the external table does not delete the ApsaraDB for HBase table.

  • Create an ApsaraDB for HBase table and use PUT to insert data into the table.

  • Create an external table that is associated with the ApsaraDB for HBase table in Hive and query data in Hive.

  • Delete the external table in Hive. The associated ApsaraDB for HBase table still exists.

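The external table steps above can be sketched as a shell function that generates the HiveQL. This is a rough sketch that follows the syntax in the Apache Hive HBaseIntegration wiki, not a statement taken from this topic: the function name external_table_hql, the table names in the example, and the column family cf1 are hypothetical, so adjust them to match your existing table.

```shell
#!/bin/sh
# Sketch: print HiveQL that maps an existing HBase table to a Hive external table.
# Hypothetical arguments: $1 Hive table name, $2 existing HBase table name.
# Assumes the HBase table has a column family cf1 with a qualifier val.
external_table_hql() {
    hive_table=$1
    hbase_table=$2
    cat <<EOF
CREATE EXTERNAL TABLE ${hive_table}(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "${hbase_table}");
SELECT * FROM ${hive_table};
-- To remove only the Hive mapping and keep the HBase table:
-- DROP TABLE ${hive_table};
EOF
}

# Example (both names are placeholders):
# external_table_hql hive_external_table existing_hbase_table > external.hql
# hive -f external.hql
```

Because the table is declared EXTERNAL, dropping it in Hive removes only the Hive metadata, which matches the behavior described in the last step.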

For more information, visit https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration

Note

Hive cannot read HFiles when it is associated with snapshots of ApsaraDB for HBase Performance-enhanced Edition. Instead, either create a table in Hive that is associated with a new ApsaraDB for HBase table, or create an external table in Hive that is associated with an existing ApsaraDB for HBase table. With either method, you can query the data in ApsaraDB for HBase by using Hive.