E-MapReduce: Use Hive to access data in ApsaraDB for HBase

Last Updated: Mar 12, 2024

To run multi-table join analysis on data stored in Alibaba Cloud ApsaraDB for HBase, you can use Hive. This topic describes how to map Hive tables in your E-MapReduce (EMR) cluster to ApsaraDB for HBase tables.

Prerequisites

  • A DataLake cluster is created. For more information, see Create a cluster.
  • An ApsaraDB for HBase cluster is created in the virtual private cloud (VPC) where your EMR cluster resides.
Note: In this example, ApsaraDB for HBase Standard Edition V2.0 is used. ApsaraDB for HBase Performance-enhanced Edition (Lindorm) is not supported.

Step 1: Add Hive configurations

  1. Go to the Configure tab.
    1. Log on to the EMR console. In the left-side navigation pane, click EMR on ECS.
    2. In the top navigation bar, select a region and a resource group based on your business requirements.
    3. On the EMR on ECS page, find the cluster to which you want to add Hive configurations and click Services in the Actions column.
    4. On the Services tab, find the Hive service and click Configure.
    5. Click the hbase-site.xml tab.
  2. Add the configuration item described in the following table.

    For more information about how to add a configuration item, see Add configuration items.

    Configuration item: hbase.zookeeper.quorum
    Description: The ZooKeeper addresses of the ApsaraDB for HBase cluster in the VPC, entered as a comma-separated list of host:port pairs. Example: hb-xxxx-master1-001.hbase.rds.aliyuncs.com:2181,hb-xxxx-master2-001.hbase.rds.aliyuncs.com:2181,hb-xxxx-master3-001.hbase.rds.aliyuncs.com:2181
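
The quorum value is a list of ZooKeeper host:port pairs joined by commas. The following is a minimal sketch, not part of the EMR console or of any HBase library, that assembles the value from the placeholder host names above and renders it as a Hadoop-style property element. The helper names build_quorum and render_property are hypothetical, and the default port 2181 is taken from the example. In the console you enter only the item name and its value; the XML is printed here only to show roughly what a property in hbase-site.xml looks like.

```python
from xml.etree import ElementTree as ET

# Placeholder ZooKeeper hosts copied from the example above; replace with your own.
ZK_HOSTS = [
    "hb-xxxx-master1-001.hbase.rds.aliyuncs.com",
    "hb-xxxx-master2-001.hbase.rds.aliyuncs.com",
    "hb-xxxx-master3-001.hbase.rds.aliyuncs.com",
]


def build_quorum(hosts, port=2181):
    """Join hosts into a comma-separated host:port list (hypothetical helper)."""
    if not hosts:
        raise ValueError("at least one ZooKeeper host is required")
    return ",".join(f"{host.strip()}:{port}" for host in hosts)


def render_property(name, value):
    """Render one <property> element in the Hadoop site-file layout."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return ET.tostring(prop, encoding="unicode")


quorum = build_quorum(ZK_HOSTS)
print(render_property("hbase.zookeeper.quorum", quorum))
```

Building the string from a list makes it harder to leave stray spaces or the word "and" inside the value when you copy addresses from the console.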

Step 2: View HBase tables

Note: For more information about how to use HBase Shell, see HBase Shell.
  1. Run the following command to connect to the ApsaraDB for HBase cluster:
    hbase shell
  2. Run the list command to check whether the HBase table hive_hbase_table or hbase_table exists.

Step 3: Create an external table in Hive and map the external table to the existing table in ApsaraDB for HBase

  1. In HBase Shell, run the following command to create a table named hbase_table with the column family f in ApsaraDB for HBase:
    create 'hbase_table','f'
  2. Insert data into the table. Both commands write to the same row, whose row key is 1122.
    1. Run the following command to write the value of the f:col1 column:
      put 'hbase_table','1122','f:col1','hello'
    2. Run the following command to write the value of the f:col2 column:
      put 'hbase_table','1122','f:col2','hbase'
  3. Create an external table in Hive, map the external table to the table in ApsaraDB for HBase, and query data from the external table.
    1. In the Hive CLI, create an external table and map it to the table in ApsaraDB for HBase. The entries in hbase.columns.mapping correspond to the Hive columns in order, and :key maps the key column to the HBase row key.
      create external table hbase_table(key int,col1 string,col2 string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,f:col1,f:col2") TBLPROPERTIES ("hbase.table.name" = "hbase_table", "hbase.mapred.output.outputtable" = "hbase_table");
    2. Query data from the external table.
      select * from hbase_table;
      The following information is returned:
      1122    hello   hbase
    3. Delete the hbase_table table from Hive.
      drop table hbase_table;
    4. Run the list command in HBase Shell to check whether the hbase_table table still exists.
      If hbase_table is still listed, dropping the external table in Hive did not affect the existing table in ApsaraDB for HBase.
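
The column mapping in Step 3 is positional, so the Hive column list and the hbase.columns.mapping string are easy to get out of step when a table has many columns. The following is a minimal sketch that generates the statement from one list of column definitions. The function external_table_ddl is a hypothetical helper, not a Hive API, and it insists on an explicit :key entry, which is a choice of this sketch. It only builds the text; you would still run the result in the Hive CLI.

```python
def external_table_ddl(hive_table, hbase_table, columns):
    """Build a CREATE EXTERNAL TABLE statement for an existing HBase table.

    columns is a list of (hive_column, hive_type, hbase_column) tuples, where
    hbase_column is ":key" for the row key or "family:qualifier" otherwise.
    """
    key_columns = [c for c in columns if c[2] == ":key"]
    if len(key_columns) != 1:
        raise ValueError("exactly one column must map to :key")
    for _, _, hbase_column in columns:
        if ":" not in hbase_column:
            raise ValueError(f"expected family:qualifier, got {hbase_column!r}")

    # Both strings are derived from the same list, so their order always matches.
    hive_columns = ", ".join(f"{name} {hive_type}" for name, hive_type, _ in columns)
    mapping = ",".join(hbase_column for _, _, hbase_column in columns)

    return (
        f"CREATE EXTERNAL TABLE {hive_table}({hive_columns}) "
        "STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' "
        f'WITH SERDEPROPERTIES ("hbase.columns.mapping" = "{mapping}") '
        f'TBLPROPERTIES ("hbase.table.name" = "{hbase_table}", '
        f'"hbase.mapred.output.outputtable" = "{hbase_table}");'
    )


ddl = external_table_ddl(
    "hbase_table",
    "hbase_table",
    [("key", "int", ":key"), ("col1", "string", "f:col1"), ("col2", "string", "f:col2")],
)
print(ddl)
```

Keeping the Hive column and its HBase counterpart in one tuple means that adding or reordering a column cannot shift the mapping by accident.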

Step 4: Create an internal table in Hive that is stored in ApsaraDB for HBase

  1. Enter the hive command to enter the Hive CLI.
  2. Run the following command to create an internal table in Hive. Hive also creates the hive_hbase_table table in ApsaraDB for HBase:
    CREATE TABLE hive_hbase_table(key int, value string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val") TBLPROPERTIES ("hbase.table.name" = "hive_hbase_table", "hbase.mapred.output.outputtable" = "hive_hbase_table");
  3. Run the following command to insert data into the hive_hbase_table table:
    insert into hive_hbase_table values(212,'bab');
  4. Run the following command to view the table data in Hive:
    select * from hive_hbase_table;
  5. Write data to the hive_hbase_table table from HBase Shell and view the data in Hive.
    1. In HBase Shell, run the following command to write data to the hive_hbase_table table:
      put 'hive_hbase_table','132','cf1:val','acb'
    2. Run the following command to view the data written to the table in Hive:
      select * from hive_hbase_table;
      The following information is returned:
      132    acb
      212    bab
  6. Delete the hive_hbase_table table and view the hive_hbase_table table in ApsaraDB for HBase.
    1. Run the following command to delete the hive_hbase_table table from Hive:
      drop table hive_hbase_table;
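    2. In HBase Shell, run the following command to view the hive_hbase_table table in ApsaraDB for HBase:
      scan 'hive_hbase_table'
      The returned information indicates that the hive_hbase_table table does not exist. Unlike an external table, dropping an internal table in Hive also deletes the underlying table in ApsaraDB for HBase.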
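
The difference between Step 3 and Step 4 can be summarized with a toy in-memory model. This is a sketch for illustration only: ToyHBase and ToyHive are hypothetical classes, not real client libraries, and the error raised when an internal table targets an HBase table that already exists is an assumption about Hive's behavior. The model also shows why the query in Step 4 returns row 132 before row 212: rows are returned in the byte order of their row keys, not in insertion order.

```python
class ToyHBase:
    """In-memory stand-in for an HBase cluster (not a real client)."""

    def __init__(self):
        self.tables = {}  # table name -> {row key bytes -> {column -> value}}

    def create(self, table):
        self.tables[table] = {}

    def put(self, table, row_key, column, value):
        # Raises KeyError if the table does not exist.
        self.tables[table].setdefault(row_key.encode(), {})[column] = value

    def scan(self, table):
        # Rows come back sorted by the bytes of the row key.
        rows = self.tables[table]
        return [(key.decode(), rows[key]) for key in sorted(rows)]


class ToyHive:
    """Tracks Hive tables that are mapped to ToyHBase tables."""

    def __init__(self, hbase):
        self.hbase = hbase
        self.tables = {}  # Hive table name -> (HBase table name, is_external)

    def create_table(self, name, hbase_table, external):
        exists = hbase_table in self.hbase.tables
        if external and not exists:
            raise ValueError("an external table needs an existing HBase table")
        if not external:
            if exists:  # assumed behavior, see the lead-in
                raise ValueError("HBase table already exists; use an external table")
            self.hbase.create(hbase_table)  # internal table: Hive creates it
        self.tables[name] = (hbase_table, external)

    def drop_table(self, name):
        hbase_table, external = self.tables.pop(name)
        if not external:
            del self.hbase.tables[hbase_table]  # internal table: data is removed too


hbase = ToyHBase()
hive = ToyHive(hbase)

# Step 3: the HBase table exists first, then an external Hive table maps to it.
hbase.create("hbase_table")
hbase.put("hbase_table", "1122", "f:col1", "hello")
hive.create_table("hbase_table", "hbase_table", external=True)
hive.drop_table("hbase_table")
print("hbase_table" in hbase.tables)  # True: the HBase table survives

# Step 4: an internal Hive table; rows are returned in row-key order.
hive.create_table("hive_hbase_table", "hive_hbase_table", external=False)
hbase.put("hive_hbase_table", "212", "cf1:val", "bab")
hbase.put("hive_hbase_table", "132", "cf1:val", "acb")
print([key for key, _ in hbase.scan("hive_hbase_table")])  # ['132', '212']
hive.drop_table("hive_hbase_table")
print("hive_hbase_table" in hbase.tables)  # False: dropped together with the Hive table
```

Because keys are compared as bytes, a row key such as 9 would sort after 132 in this model, which is worth keeping in mind when you choose row key formats.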