This topic describes how to configure an HBase cluster and use the HBase storage service.
Prerequisites
Configure an HBase cluster
When you create an HBase cluster, you can turn on Custom Software Settings in the Advanced Settings section of the Software Settings step and modify the default HBase configurations. Example:
{
  "configurations": [
    {
      "classification": "hbase-site",
      "properties": {
        "hbase.hregion.memstore.flush.size": "268435456",
        "hbase.regionserver.global.memstore.size": "0.5",
        "hbase.regionserver.global.memstore.lowerLimit": "0.6"
      }
    }
  ]
}
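If you generate these settings from a script, the structure above can be produced programmatically. The following is a minimal sketch, not part of the product tooling: the helper name `build_custom_settings` is hypothetical, and the only assumption about the format is what the example above shows (a `configurations` list whose entries carry a `classification` and string-valued `properties`).

```python
import json


def build_custom_settings(classification, properties):
    """Return a Custom Software Settings JSON string for one classification.

    Hypothetical helper: it only mirrors the structure of the example above.
    """
    config = {
        "configurations": [
            {
                "classification": classification,
                # The example writes every value as a string, so do the same here.
                "properties": {key: str(value) for key, value in properties.items()},
            }
        ]
    }
    return json.dumps(config, indent=2)


settings = build_custom_settings(
    "hbase-site",
    {
        # 256 * 1024 * 1024 bytes = 268435456, the flush size used in the example.
        "hbase.hregion.memstore.flush.size": 256 * 1024 * 1024,
        "hbase.regionserver.global.memstore.size": 0.5,
        "hbase.regionserver.global.memstore.lowerLimit": 0.6,
    },
)
print(settings)
```

The printed text can be pasted into the Custom Software Settings field. Writing the flush size as a product of factors makes the byte value easier to check by eye.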
The following table lists the default HBase configurations.
Key | Value |
---|---|
zookeeper.session.timeout | 180000 |
hbase.regionserver.global.memstore.size | 0.35 |
hbase.regionserver.global.memstore.lowerLimit | 0.3 |
hbase.hregion.memstore.flush.size | 134217728 (128 MB) |
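To see which values a cluster would end up with, you can overlay your custom properties on the defaults in the table. The sketch below is illustrative only: `DEFAULTS`, `effective_config`, and `changed_keys` are hypothetical names, and it assumes that the 128 MB default corresponds to 128 * 1024 * 1024 bytes, which is consistent with the byte value used in the custom example.

```python
# Defaults copied from the table above.
# Assumption: "128 MB" means 128 * 1024 * 1024 = 134217728 bytes.
DEFAULTS = {
    "zookeeper.session.timeout": "180000",
    "hbase.regionserver.global.memstore.size": "0.35",
    "hbase.regionserver.global.memstore.lowerLimit": "0.3",
    "hbase.hregion.memstore.flush.size": str(128 * 1024 * 1024),
}


def effective_config(overrides):
    """Return the defaults with the custom properties applied on top."""
    merged = dict(DEFAULTS)
    merged.update(overrides)
    return merged


def changed_keys(overrides):
    """Return the keys whose custom value differs from the listed default."""
    return sorted(key for key, value in overrides.items() if DEFAULTS.get(key) != value)


custom = {
    "hbase.hregion.memstore.flush.size": "268435456",
    "hbase.regionserver.global.memstore.size": "0.5",
}
for key, value in sorted(effective_config(custom).items()):
    print(f"{key} = {value}")
print("changed:", changed_keys(custom))
```

Keys that you do not override keep the values listed in the table.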
Access HBase
Examples
- Use Spark to access HBase
For more information, see spark-hbase-connector.
- Use Hadoop to access HBase
For more information, see HBase MapReduce Examples.
- Use Hive to access HBase
- Log on to the master node of a Hive cluster and add the following entry to the hosts file, where $zk_ip is the IP address of the ZooKeeper node in the HBase cluster:
$zk_ip emr-cluster
- For more information about how to perform Hive-related operations, see Hive HBase Integration.
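The hosts file step for Hive can also be scripted. The following is a minimal sketch under stated assumptions: `add_hosts_entry` is a hypothetical helper that works on the file's text rather than on /etc/hosts directly, `192.0.2.10` is a placeholder for the ZooKeeper node's IP address, and `emr-cluster` is the host name taken from the step above.

```python
def add_hosts_entry(hosts_text, ip, hostname="emr-cluster"):
    """Return hosts_text with an `ip hostname` line appended.

    If the host name is already mapped on some line, the text is returned
    unchanged so that running the step twice does not add a duplicate.
    """
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split into address and names.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and hostname in fields[1:]:
            return hosts_text
    if hosts_text and not hosts_text.endswith("\n"):
        hosts_text += "\n"
    return hosts_text + f"{ip} {hostname}\n"


current = "127.0.0.1 localhost\n"
# 192.0.2.10 is a placeholder; use the IP address of your ZooKeeper node.
updated = add_hosts_entry(current, "192.0.2.10")
print(updated, end="")
```

Working on the text instead of the file keeps the sketch safe to run anywhere. To apply it on the master node, you would read the hosts file, pass its contents through the function, and write the result back with sufficient privileges.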