This topic describes how to configure an HBase cluster and use the HBase storage service.

Prerequisites

An E-MapReduce (EMR) cluster is created, and the HBase service is added to the cluster. For more information, see Create a cluster.

Configure an HBase cluster

When you create an HBase cluster, you can turn on Custom Software Settings in the Advanced Settings section of the Software Settings step and modify the default HBase configurations. Example:
{
  "configurations": [
    {
      "classification": "hbase-site",
      "properties": {
        "hbase.hregion.memstore.flush.size": "268435456",
        "hbase.regionserver.global.memstore.size": "0.5",
        "hbase.regionserver.global.memstore.lowerLimit": "0.6"
      }
    }
  ]
}
The following table lists the default HBase configurations.
Key                                              Value
zookeeper.session.timeout                        180000 (milliseconds)
hbase.regionserver.global.memstore.size          0.35
hbase.regionserver.global.memstore.lowerLimit    0.3
hbase.hregion.memstore.flush.size                128 MB (134217728 bytes)
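
The flush size is written in bytes in the JSON example but listed in MB in the defaults, so generating the settings block can help avoid unit mistakes. The following Python sketch is illustrative only: the helper names mb_to_bytes and build_hbase_site are hypothetical, and it assumes that the console accepts the same JSON shape as the example above.

```python
import json


def mb_to_bytes(megabytes: int) -> int:
    """Convert a size in MB to bytes, taking 1 MB as 1024 * 1024 bytes."""
    return megabytes * 1024 * 1024


def build_hbase_site(flush_size_mb: int, memstore_size: float,
                     memstore_lower_limit: float) -> str:
    """Return a Custom Software Settings JSON string for hbase-site."""
    settings = {
        "configurations": [
            {
                "classification": "hbase-site",
                "properties": {
                    # Property values are strings, as in the example above.
                    "hbase.hregion.memstore.flush.size":
                        str(mb_to_bytes(flush_size_mb)),
                    "hbase.regionserver.global.memstore.size":
                        str(memstore_size),
                    "hbase.regionserver.global.memstore.lowerLimit":
                        str(memstore_lower_limit),
                },
            }
        ]
    }
    return json.dumps(settings, indent=2)


if __name__ == "__main__":
    # Same values as the example above: 256 MB, 0.5, and 0.6.
    print(build_hbase_site(256, 0.5, 0.6))
```

The printed JSON can be pasted into the Custom Software Settings field.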

Access HBase

  1. Log on to the master node of the cluster by using SSH.
  2. Run the following command to access HBase Shell:
    hbase shell
    The following information is returned:
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/opt/apps/ecm/service/hbase/1.4.9-1.0.0/package/hbase-1.4.9-1.0.0/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/apps/ecm/service/hadoop/2.8.5-1.5.3/package/hadoop-2.8.5-1.5.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
    HBase Shell
    Use "help" to get list of supported commands.
    Use "exit" to quit this interactive shell.
    Version 1.4.9, r8214a16c5d80f077abf1aa01bb312851511a2b15, Thu Jan 31 20:35:22 CST 2019
    
    hbase(main):001:0>
    Note: If you use an instance that was created in the ECS console, see Use HBase Shell.
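
After the shell starts, a short smoke test can confirm that the cluster accepts reads and writes. The following Python sketch builds a few standard HBase Shell commands (create, put, get, scan, disable, and drop) and pipes them to hbase shell -n, the non-interactive mode of HBase Shell. The table name emr_demo, the column family cf, and the helper names are hypothetical. The sketch assumes Python 3.7 or later and that the hbase command is on the PATH of the node where the script runs.

```python
import subprocess


def build_smoke_test(table: str = "emr_demo", family: str = "cf") -> str:
    """Return HBase Shell commands that create, write, read, and drop a table."""
    commands = [
        f"create '{table}', '{family}'",
        f"put '{table}', 'row1', '{family}:greeting', 'hello'",
        f"get '{table}', 'row1'",
        f"scan '{table}'",
        # A table must be disabled before it can be dropped.
        f"disable '{table}'",
        f"drop '{table}'",
    ]
    return "\n".join(commands) + "\n"


def run_in_hbase_shell(script: str) -> str:
    """Pipe the script to a non-interactive HBase Shell and return its output."""
    result = subprocess.run(
        ["hbase", "shell", "-n"],
        input=script,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Example call on the master node:
#   print(run_in_hbase_shell(build_smoke_test()))
```

The test table is dropped at the end, so the script leaves no data behind if every command succeeds.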

Examples

  • Use Spark to access HBase

    For more information, see spark-hbase-connector.

  • Use Hadoop to access HBase

    For more information, see HBase MapReduce Examples.

  • Use Hive to access HBase
    1. Log on to the master node of a Hive cluster and add the following line to the hosts file. Replace $zk_ip with the IP address of the ZooKeeper node in the HBase cluster.
      $zk_ip emr-cluster
    2. Perform Hive operations on HBase. For more information, see Hive HBase Integration.
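
Step 1 of the Hive procedure can be scripted so that the mapping is added only once. The following Python sketch appends an emr-cluster entry to a hosts file if no existing line already maps that name. The function name ensure_hosts_entry and the sample IP address 192.0.2.10 are hypothetical, and the sketch assumes that the hosts file is /etc/hosts, which requires root permissions to modify.

```python
from pathlib import Path


def ensure_hosts_entry(hosts_path: str, zk_ip: str,
                       hostname: str = "emr-cluster") -> bool:
    """Append 'zk_ip hostname' unless the hostname is already mapped.

    Returns True if a line was added and False if the file was unchanged.
    """
    path = Path(hosts_path)
    text = path.read_text() if path.exists() else ""
    for line in text.splitlines():
        # Drop comments; a hosts line has the form "<ip> <name> [alias ...]".
        fields = line.split("#", 1)[0].split()
        if hostname in fields[1:]:
            return False
    # Make sure the existing content ends with a newline before appending.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with path.open("a") as hosts:
        hosts.write(f"{prefix}{zk_ip} {hostname}\n")
    return True
```

For example, ensure_hosts_entry("/etc/hosts", "192.0.2.10") could be run as root on the Hive master node, with 192.0.2.10 replaced by the actual ZooKeeper IP address.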