ApsaraDB for HBase Performance-enhanced Edition allows you to connect to the database service by using Hive. However, Hive does not use standard operations such as GET or PUT to call ApsaraDB for HBase. Instead, it calls internal HBase classes. Therefore, you cannot make Hive compatible by simply adding the alihbase-connector JAR package to the hive/lib directory. You must replace the existing HBase JAR packages in the hive/lib directory instead.
Delete all JAR packages whose names start with hbase from the hive/lib directory. Be careful not to delete hive-hbase-handler-{version}.jar, which contains the logic that Hive uses to access HBase.
Download the JAR package file of the alihbase compatible client. Put all JAR packages whose names start with alihbase- into the hive/lib directory. The other dependencies in the download do not need to be added.
If the HBase dependency was specified in the hive/.hiverc file or by using the --auxpath parameter when starting Hive, replace the loaded JAR packages with the new JAR packages whose names start with alihbase.
Retrieve an endpoint
For more information, see Use the Java API to access an enhanced edition cluster.
Retrieve the username and password
The default username and password are both root. If you disable the ACL feature on the cluster management page, you do not need to provide a username or password.
Add the IP address of the server where Hive is deployed to the ApsaraDB for HBase whitelist.
The IP addresses of all Hive machines that access HBase must be added to the whitelist of the HBase cluster. Otherwise, these machines cannot access the cluster.
Configure connection parameters in Hive
You can configure the parameters for connecting to HBase in Hive in two ways. The first is to configure them directly in the hive-site.xml file. Add the following configurations to this file:
<configuration>
  <!-- The endpoint of your ApsaraDB for HBase cluster. You can retrieve the cluster endpoint from the Database Connection page of the ApsaraDB for HBase console. Different cluster endpoints are used to access ApsaraDB for HBase over a VPC network and a public network. -->
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>ld-xxxx-proxy-hbaseue.hbaseue.xxx.rds.aliyuncs.com:30020</value>
  </property>
  <!-- Specify the username and password. The default username and password are both root. -->
  <property>
    <name>hbase.client.username</name>
    <value>root</value>
  </property>
  <property>
    <name>hbase.client.password</name>
    <value>root</value>
  </property>
</configuration>
You can also specify the parameters by running the following commands on the Hive client:
set hbase.zookeeper.quorum=ld-xxxx-proxy-hbaseue.hbaseue.xxx.rds.aliyuncs.com:30020;
set hbase.client.username=root;
set hbase.client.password=root;
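To confirm that the session picked up the connection parameters, you can print their current values. This is a minimal sketch that assumes the Hive CLI, where running set with a property name and no value displays that property:

```sql
-- Print the current value of each connection property in this Hive session.
-- A property that has not been set is reported as undefined.
set hbase.zookeeper.quorum;
set hbase.client.username;
```

Values set this way apply only to the current session, whereas values in hive-site.xml apply to every session that loads that file.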
How to use Hive
If the ApsaraDB for HBase table that you want to manage does not exist, you can run the command for creating a table in Hive. A Hive table and ApsaraDB for HBase table are created and automatically associated with each other.
Launch the Hive CLI.
Create an ApsaraDB for HBase table.
CREATE TABLE hive_hbase_table(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "hive_hbase_table", "hbase.mapred.output.outputtable" = "hive_hbase_table");
Insert data into the ApsaraDB for HBase table by using the Hive CLI.
insert into hive_hbase_table values(212,'bab');
View the ApsaraDB for HBase table. You can see the table has been created and the data has been inserted.
Write data to the ApsaraDB for HBase table and check the data in Hive.
Query the data in Hive:
After you delete the table in Hive, the associated table in ApsaraDB for HBase is also deleted.
Query the table in ApsaraDB for HBase. An error message is returned indicating that the table does not exist.
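The Hive-side commands in the preceding steps can be sketched as follows. This is a minimal sketch that assumes the hive_hbase_table table created earlier in this section. The checks on the ApsaraDB for HBase side are run in the HBase shell and are not shown here.

```sql
-- Query the data through Hive. Rows written from either Hive or HBase
-- are visible here because both sides share the same underlying table.
SELECT * FROM hive_hbase_table;

-- Drop the Hive table. Because it was created without the EXTERNAL keyword,
-- the associated ApsaraDB for HBase table is deleted as well.
DROP TABLE hive_hbase_table;
```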
If the ApsaraDB for HBase table already exists, you can associate it with an external table in Hive. If you delete the external table, the ApsaraDB for HBase table will not be deleted.
Create an ApsaraDB for HBase table and use PUT to insert data into the table.
Create an external table that is associated with the ApsaraDB for HBase table in Hive and query data in Hive.
Delete the external table in Hive. The associated ApsaraDB for HBase table still exists.
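The Hive side of this external-table workflow might look like the following sketch. The names hive_external_table and existing_hbase_table, the column family cf1, and the qualifier val are hypothetical placeholders. Replace them with the names used by your existing ApsaraDB for HBase table.

```sql
-- Hypothetical names: hive_external_table (Hive) and existing_hbase_table (HBase),
-- with an assumed column family cf1 and column qualifier val.
CREATE EXTERNAL TABLE hive_external_table(key string, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "existing_hbase_table",
               "hbase.mapred.output.outputtable" = "existing_hbase_table");

-- Query the rows that were written to HBase by using PUT.
SELECT * FROM hive_external_table;

-- Drop only the Hive metadata. The ApsaraDB for HBase table is kept.
DROP TABLE hive_external_table;
```

The row key is mapped as a string in this sketch because HBase row keys written with PUT from the shell are typically plain strings. Use a different Hive type only if the stored keys match it.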
For more information, visit https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration
Note
Hive cannot read HFiles directly from snapshots of ApsaraDB for HBase Performance-enhanced Edition. Instead, you can create a table in Hive that is associated with an ApsaraDB for HBase table, or create an external table in Hive and associate it with an existing ApsaraDB for HBase table. With either method, you can query the data in ApsaraDB for HBase by using Hive.