A full-text index mapping connects an HBase table to a Search index. Each mapping defines which HBase columns to synchronize, how to encode the rowkey, and the data type of each column. This page covers how to create, update, and inspect mappings using HBase Shell.
Prerequisites
Before you begin, ensure that you have:
Read the Quick start guide
Downloaded and configured the latest version of HBase Shell
Create a mapping
Store the mapping configuration in a JSON file and pass it to HBase Shell. The following example maps two columns from testTable to the democollection Search index.
{
  "sourceNamespace": "default",
  "sourceTable": "testTable",
  "targetIndexName": "democollection",
  "indexType": "SOLR",
  "rowkeyFormatterType": "STRING",
  "fields": [
    {
      "source": "f:name",
      "targetField": "name_s",
      "type": "STRING"
    },
    {
      "source": "f:age",
      "targetField": "age_i",
      "type": "INT"
    }
  ]
}
In this example, data from testTable is synchronized to democollection. The f:name column (column family and qualifier separated by :) maps to name_s in the index, and f:age maps to age_i.
Mapping parameters
| Parameter | Description |
|---|---|
| sourceNamespace | The namespace of the HBase table. Leave blank or set to default if the table has no namespace. |
| sourceTable | The HBase table name, without the namespace. |
| targetIndexName | The name of the Search index. |
| indexType | Fixed value: SOLR. |
| rowkeyFormatterType | How the rowkey is encoded as the index document ID. Valid values: STRING, HEX. See Choose a rowkeyFormatterType. |
| fields | A JSON array of column mappings. Separate multiple entries with commas. See Configure field mappings. |
Choose a rowkeyFormatterType
rowkeyFormatterType determines how the HBase rowkey is converted to the ID field in the Search index (which is always a string).
| Value | When to use | Encoding | Decoding |
|---|---|---|---|
| STRING | Rowkey was written using Bytes.toBytes(String), for example row1, order0001, or the string "12345" | Bytes.toString(byte[]) | Bytes.toBytes(String) |
| HEX | Rowkey is a numeric type, a composite key including non-string fields, or anything not written using Bytes.toBytes(String) | Hex.encodeHexString(byte[]) from org.apache.commons.codec.binary.Hex | Hex.decodeHex(String.toCharArray()) |
How to decide: The key question is how the rowkey bytes were originally written. If Bytes.toBytes(String) was used, choose STRING. For all other cases — including numeric rowkeys written with Bytes.toBytes(int) or Bytes.toBytes(long) — choose HEX.
If the rowkey was not written to the HBase table with Bytes.toBytes(String), do not treat it as STRING. Set rowkeyFormatterType to HEX for such tables. Otherwise, converting the document ID in the index back to bytes may produce a value that differs from the original rowkey.
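The difference between the two formatters can be seen with a short round-trip check. The sketch below is illustrative only. It uses the JDK's ByteBuffer and String as stand-ins for Bytes.toBytes(long) and Bytes.toString(byte[]) (assumed here to produce big-endian and UTF-8 bytes respectively), and a hand-written hex helper in place of commons-codec, so it runs without extra dependencies. The class name RowkeyRoundTrip and its methods are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class RowkeyRoundTrip {
    // Stand-in for Bytes.toBytes(long): 8 bytes, big-endian (assumption).
    static byte[] longToBytes(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    // STRING formatter sketch: decode the bytes as UTF-8, then encode back.
    static byte[] viaString(byte[] rowkey) {
        String id = new String(rowkey, StandardCharsets.UTF_8);
        return id.getBytes(StandardCharsets.UTF_8);
    }

    // HEX formatter sketch: two lowercase hex digits per byte.
    static String toHex(byte[] rowkey) {
        StringBuilder sb = new StringBuilder(rowkey.length * 2);
        for (byte b : rowkey) {
            sb.append(Character.forDigit((b >> 4) & 0xF, 16));
            sb.append(Character.forDigit(b & 0xF, 16));
        }
        return sb.toString();
    }

    // Inverse of toHex: parse two hex digits into each byte.
    static byte[] fromHex(String id) {
        byte[] out = new byte[id.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(id.charAt(2 * i), 16);
            int lo = Character.digit(id.charAt(2 * i + 1), 16);
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] textKey = "order0001".getBytes(StandardCharsets.UTF_8);
        // The last byte of 200L is 0xC8, which is not valid UTF-8 on its own.
        byte[] numericKey = longToBytes(200L);

        System.out.println("text key via STRING:   " + Arrays.equals(textKey, viaString(textKey)));
        System.out.println("numeric key via STRING: " + Arrays.equals(numericKey, viaString(numericKey)));
        System.out.println("numeric key via HEX:    " + Arrays.equals(numericKey, fromHex(toHex(numericKey))));
    }
}
```

In this sketch the text rowkey survives the STRING round trip, the numeric rowkey does not (the invalid byte is replaced during decoding), and the hex round trip preserves the numeric rowkey exactly.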
Configure field mappings
Each entry in the fields array maps one HBase column to one Search index field.
| Parameter | Description |
|---|---|
| source | The HBase column to map, in family:qualifier format. For example, f:name. |
| targetField | The destination field name in the Search index. Dynamic columns (those with a type-indicating suffix) are identified automatically without requiring a predefined schema. |
| type | The data type of the column as stored in HBase. Must match how the data was written using Bytes.toBytes(). Valid values (case-sensitive): INT, LONG, STRING, BOOLEAN, FLOAT, DOUBLE, SHORT, BIGDECIMAL. |
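The rules in the table above can be checked before a mapping file is submitted. The following is a minimal sketch, not part of HBase Shell or the Search service: the class FieldMappingCheck and its check method are hypothetical, and the only rules it applies are the ones stated in the table (family:qualifier format, a non-empty target field, and a case-sensitive type from the listed values).

```java
import java.util.Set;

final class FieldMappingCheck {
    // Valid type values as listed in the table (case-sensitive).
    static final Set<String> VALID_TYPES = Set.of(
        "INT", "LONG", "STRING", "BOOLEAN", "FLOAT", "DOUBLE", "SHORT", "BIGDECIMAL");

    // Returns null when the entry looks valid, otherwise a short reason.
    static String check(String source, String targetField, String type) {
        int colon = (source == null) ? -1 : source.indexOf(':');
        // Require a non-empty family before the colon and a non-empty qualifier after it.
        if (colon <= 0 || colon == source.length() - 1) {
            return "source must be in family:qualifier format";
        }
        if (targetField == null || targetField.isEmpty()) {
            return "targetField must not be empty";
        }
        if (type == null || !VALID_TYPES.contains(type)) {
            return "unsupported type: " + type;
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(check("f:name", "name_s", "STRING")); // valid entry
        System.out.println(check("name", "name_s", "STRING"));   // missing column family
        System.out.println(check("f:age", "age_i", "int"));      // wrong case for type
    }
}
```

A check like this only catches formatting mistakes. It cannot tell whether the declared type matches how the bytes were actually written to HBase.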
Dynamic columns
The Search service supports dynamic columns, which use a suffix naming convention to identify the field type automatically. You do not need to predefine each field in the managed_schema configuration set.
Common suffixes and their corresponding types:
| Suffix | Type |
|---|---|
| _s | STRING |
| _i | INT |
For example, name_s is automatically recognized as a STRING field, and age_i as an INT field. For the full list of supported suffixes, see Update the configuration set.
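The suffix convention can be illustrated with a small lookup. This sketch covers only the two suffixes listed above and makes no claim about how the Search service matches names internally. The class DynamicFieldSuffix and its inferType method are hypothetical.

```java
import java.util.Map;
import java.util.Optional;

final class DynamicFieldSuffix {
    // Only the two suffixes shown in the table above; others are not covered here.
    static final Map<String, String> SUFFIX_TO_TYPE = Map.of(
        "_s", "STRING",
        "_i", "INT");

    // Returns the type implied by the field name's last "_xxx" suffix, if known.
    static Optional<String> inferType(String targetField) {
        int idx = targetField.lastIndexOf('_');
        if (idx < 0) {
            return Optional.empty();
        }
        return Optional.ofNullable(SUFFIX_TO_TYPE.get(targetField.substring(idx)));
    }

    public static void main(String[] args) {
        System.out.println(inferType("name_s"));     // suffix _s
        System.out.println(inferType("age_i"));      // suffix _i
        System.out.println(inferType("first_name")); // no known suffix
    }
}
```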
How HBase data types work
HBase stores all data as raw bytes. The type parameter tells the Search service how to interpret those bytes during synchronization.
// Written as a 4-byte int: declare type INT in the mapping.
int age = 25;
byte[] ageValue = Bytes.toBytes(age);
put.addColumn(Bytes.toBytes("f"), Bytes.toBytes("age"), ageValue);
// Written as a string: declare type STRING in the mapping.
String name = "25";
byte[] nameValue = Bytes.toBytes(name);
put.addColumn(Bytes.toBytes("f"), Bytes.toBytes("name"), nameValue);
In this example, f:age is INT and f:name is STRING, even though both happen to hold the value 25. Setting the wrong type causes the Search service to call the wrong decode method (for example, Bytes.toInt() on a string column), resulting in garbled or failed data synchronization.
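To see why a wrong type produces garbled or failed results, the sketch below decodes string bytes as an int. It is an illustration under assumptions: ByteBuffer stands in for Bytes.toInt(byte[]) (assumed to read a 4-byte big-endian value), and the exception type thrown by the stand-in differs from what the HBase utility would throw. The class name WrongTypeDecode is hypothetical.

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class WrongTypeDecode {
    // Stand-in for Bytes.toInt(byte[]): reads a 4-byte big-endian int (assumption).
    static int decodeInt(byte[] value) {
        return ByteBuffer.wrap(value).getInt();
    }

    public static void main(String[] args) {
        byte[] writtenAsInt = ByteBuffer.allocate(Integer.BYTES).putInt(25).array();
        byte[] writtenAsString = "25".getBytes(StandardCharsets.UTF_8);     // 2 bytes
        byte[] fourCharString = "2500".getBytes(StandardCharsets.UTF_8);    // 4 bytes

        // Correct type: the original value comes back.
        System.out.println(decodeInt(writtenAsInt));

        // Wrong type, but the length happens to be 4: decodes to an unrelated number.
        System.out.println(decodeInt(fourCharString));

        // Wrong type and wrong length: decoding fails outright.
        try {
            decodeInt(writtenAsString);
        } catch (BufferUnderflowException e) {
            System.out.println("cannot decode 2 bytes as an int");
        }
    }
}
```

The second case is the more dangerous one, because the decode succeeds and a meaningless number is indexed without any error.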
The source column type and the index field type do not need to match. For example, f:age can be stored as STRING in HBase while targetField points to age_i (INT) in the index. The Search service converts the value automatically. If the string value cannot be converted — for example, a non-numeric value in an INT field — an error occurs during synchronization.
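The conversion case can be pictured as follows. The source does not describe how the Search service performs the conversion, so this is only a sketch of the general idea using Integer.parseInt. The class StringToIntField and its method are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.OptionalInt;

final class StringToIntField {
    // Decode the HBase value as a string, then try to convert it for an INT index field.
    static OptionalInt toIndexInt(byte[] hbaseValue) {
        String text = new String(hbaseValue, StandardCharsets.UTF_8);
        try {
            return OptionalInt.of(Integer.parseInt(text));
        } catch (NumberFormatException e) {
            // Non-numeric text cannot be stored in an INT field.
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(toIndexInt("25".getBytes(StandardCharsets.UTF_8)));
        System.out.println(toIndexInt("unknown".getBytes(StandardCharsets.UTF_8)));
    }
}
```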
Manage the schema
Use HBase Shell commands to view and modify the mapping after it is created.
View the current mapping
Run describe_external_index to get the full mapping schema in JSON format:
hbase(main):005:0> describe_external_index 'testTable'
Modify the mapping
Use alter_external_index to replace the entire mapping schema. Place the schema.json file in the HBase Shell startup directory, or provide a relative or absolute path.
hbase(main):006:0> alter_external_index 'HBase table name', 'schema.json'
Each alter_external_index call replaces the full schema. To add, update, or remove multiple columns at once, include all columns in a single JSON file.
To delete all field mappings from a table, set fields to an empty array:
{
  "sourceNamespace": "default",
  "sourceTable": "testTable",
  "targetIndexName": "democollection",
  "indexType": "SOLR",
  "rowkeyFormatterType": "STRING",
  "fields": []
}
Add columns to the mapping
Use add_external_index_field to add individual columns without replacing the full schema:
hbase shell> add_external_index_field 'testTable', {FAMILY => 'f', QUALIFIER => 'money', TARGETFIELD => 'money_f', TYPE => 'FLOAT'}
add_external_index_field only works on tables whose schema has already been modified using alter_external_index. For bulk changes involving many columns, use alter_external_index instead, which replaces the entire schema in a single operation.
Remove columns from the mapping
Use remove_external_index to delete specific column mappings:
hbase shell> remove_external_index 'testTable', 'f:name', 'f:age'
What's next
Update the configuration set: configure the managed_schema for non-dynamic columns and review all supported dynamic field suffixes