
E-MapReduce: Access HBase

Last Updated: Mar 26, 2026

E-MapReduce (EMR) lets you configure HBase cluster settings at creation time and access HBase through the shell or big data frameworks.

Prerequisites

Before you begin, ensure that you have created an EMR cluster with the HBase service deployed and that you can connect to the master node over SSH.

Configure an HBase cluster

Customize HBase settings when you create a cluster. On the Software Settings step, expand Advanced Settings, turn on Custom Software Settings, and provide your configuration in the following JSON format:

{
  "configurations": [
    {
      "classification": "hbase-site",
      "properties": {
        "hbase.hregion.memstore.flush.size": "268435456",
        "hbase.regionserver.global.memstore.size": "0.5",
        "hbase.regionserver.global.memstore.lowerLimit": "0.6"
      }
    }
  ]
}

The following list describes common HBase parameters, their default values, and when to adjust them.

zookeeper.session.timeout (default: 180000 ms)
    Increase if you see frequent ZooKeeper session timeouts under heavy load.

hbase.regionserver.global.memstore.size (default: 0.35)
    Increase (for example, to 0.5) for write-heavy workloads to keep more data in memory before flushing. Keep hbase.regionserver.global.memstore.lowerLimit below this value.

hbase.regionserver.global.memstore.lowerLimit (default: 0.3)
    Set to a value lower than hbase.regionserver.global.memstore.size to provide headroom before forced flushes.

hbase.hregion.memstore.flush.size (default: 128 MB)
    Increase for workloads with short bursts of writes so that all writes stay in memory during the burst and are flushed together, reducing I/O overhead. The value in the JSON example (268435456) equals 256 MB.
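Size parameters such as hbase.hregion.memstore.flush.size are specified in bytes in hbase-site. A quick shell check of the conversion used in the JSON example above:

```shell
# 256 MB expressed in bytes, the value used for
# hbase.hregion.memstore.flush.size in the JSON example above.
echo $((256 * 1024 * 1024))
# → 268435456
```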

Access HBase Shell

  1. Connect to the master node of your cluster over SSH. See Log on to a cluster.

  2. Start HBase Shell:

    hbase shell

    A successful start produces output similar to the following. The SLF4J binding warnings are expected and do not affect functionality.

    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/opt/apps/ecm/service/hbase/1.4.9-1.0.0/package/hbase-1.4.9-1.0.0/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/apps/ecm/service/hadoop/2.8.5-1.5.3/package/hadoop-2.8.5-1.5.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
    HBase Shell
    Use "help" to get list of supported commands.
    Use "exit" to quit this interactive shell.
    Version 1.4.9, r8214a16c5d80f077abf1aa01bb312851511a2b15, Thu Jan 31 20:35:22 CST 2019
    
    hbase(main):001:0>
  3. Run basic HBase Shell commands to verify the connection and explore your data.

    Create a table

    The following command creates a table named contacts with two column families: personal and office.

    hbase(main):001:0> create 'contacts', 'personal', 'office'

    Write data

    hbase(main):002:0> put 'contacts', '1000', 'personal:name', 'John Dole'
    hbase(main):003:0> put 'contacts', '1000', 'personal:phone', '1-425-000-0001'
    hbase(main):004:0> put 'contacts', '1000', 'office:phone', '1-425-000-0002'
    hbase(main):005:0> put 'contacts', '1000', 'office:address', '1111 San Gabriel Dr.'

    Read data

    hbase(main):006:0> get 'contacts', '1000'

    Scan the table

    hbase(main):007:0> scan 'contacts'

    Delete the table

    hbase(main):008:0> disable 'contacts'
    hbase(main):009:0> drop 'contacts'

    Run help to see all available commands, or exit to quit HBase Shell.
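If you prefer to script these commands rather than type them interactively, you can pipe them into HBase Shell. A minimal sketch, assuming the hbase command is on the PATH of the master node (the table name here is illustrative):

```shell
# Run HBase Shell commands non-interactively. The -n flag suppresses the
# interactive prompt and makes the exit code reflect command success
# (available in HBase 1.2 and later).
hbase shell -n <<'EOF'
create 'contacts_test', 'personal'
put 'contacts_test', '1000', 'personal:name', 'John Dole'
scan 'contacts_test'
disable 'contacts_test'
drop 'contacts_test'
EOF
```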

Note If you are using an instance created in the ECS console instead of an EMR cluster, see Use HBase Shell to access an ApsaraDB for HBase Standard Edition cluster.

Access HBase from big data frameworks

Use Spark to access HBase

Use the spark-hbase-connector library to read from and write to HBase tables in Spark jobs.

Use Hadoop to access HBase

Use MapReduce to process HBase data. See HBase MapReduce examples for complete code samples.
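As a quick way to see HBase and MapReduce working together, HBase ships with a RowCounter utility that counts the rows of a table as a MapReduce job. Run it from the master node; this sketch assumes the contacts table from the shell example above still exists:

```shell
# Count rows in the `contacts` table with the MapReduce-based RowCounter
# utility bundled with HBase. Run on the master node of the cluster.
hbase org.apache.hadoop.hbase.mapreduce.RowCounter 'contacts'
```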

Use Hive to access HBase

  1. Log in to the master node of the Hive cluster and add the ZooKeeper node's IP address to the /etc/hosts file. Replace $zk_ip with the actual IP address of the ZooKeeper node in the HBase cluster.

    $zk_ip emr-cluster
  2. Follow the Hive HBase Integration guide to complete the integration.
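To illustrate what the integration enables, Hive can expose an existing HBase table through the HBaseStorageHandler. The sketch below maps the contacts table from the shell example; the Hive table and column names are illustrative, and the command assumes you run it on the Hive master node after completing the steps above:

```shell
# Create a Hive external table backed by the existing HBase `contacts`
# table. The HBase row key maps to the `key` column and the
# personal:name cell maps to the `name` column.
hive -e "
CREATE EXTERNAL TABLE hbase_contacts (key STRING, name STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,personal:name')
TBLPROPERTIES ('hbase.table.name' = 'contacts');
"
```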