Principles and best practices for rowkey design - ApsaraDB for HBase

A rowkey is the unique identifier for each row in an HBase table. It controls how data is stored, partitioned, and accessed. Design rowkeys carefully before writing data at scale.

This topic covers how rowkeys work and four design considerations, with tradeoffs and examples for log data and transaction data.

How rowkeys work

Query methods

HBase supports two query methods, each with different constraints on rowkey design:

| Method | Description | Constraint |
| --- | --- | --- |
| GET | Looks up a single row by its complete rowkey | All fields that make up the rowkey must be known |
| Scan | Reads a range of rows between a start key and an end key | Only prefix-based ranges are supported |

Prefix constraint for scans: A scan matches rows that start with a given prefix, but cannot query by suffix or match values in the middle of a rowkey. For example, if rowkeys are dictionary words, a scan can find all words starting with pre, but cannot find words ending with ing.
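The prefix constraint follows from how a scan range is defined: a start key (inclusive) and a stop key (exclusive). The sketch below shows one way to derive the stop key for a given prefix. The class and method names are hypothetical and the code uses only the Java standard library; in a real application, prefer the prefix-scan helpers that the HBase client provides.

```java
import java.util.Arrays;

final class PrefixScanRange {
    /**
     * Returns the exclusive stop key for a prefix scan: the smallest key that
     * sorts after every key starting with the prefix. Trailing 0xFF bytes
     * cannot be incremented, so they are dropped first. An empty result means
     * "no upper bound" (scan to the end of the table).
     */
    static byte[] stopRowForPrefix(byte[] prefix) {
        int end = prefix.length;
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++; // increment the last remaining byte
        return stop;
    }
}
```

For the prefix pre, the start key is pre and the stop key is prf, so the range covers exactly the keys that begin with pre. No comparable range exists for "ends with ing", which is why suffix queries need an index or a filter.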

For queries that cannot be expressed as a prefix scan, use one of the following approaches:

Create an index table with an inverted key structure
Apply a server-side filter to discard unwanted rows
Use secondary indexes

Rowkey uniqueness and versions

Rows with the same rowkey are treated as a single row: writing the same rowkey and column again adds a new version of the cell rather than a new record. By default, a GET returns the latest version. Keep rowkeys unique unless you intentionally rely on this multi-version behavior.

Use a rowkey like a database primary key. It can be a single field or a composite of multiple fields:

[user_id] — one record per user
[user_id][order_id] — multiple records per user

Design considerations

Data distribution: avoid hot spots

HBase partitions a table into Regions by rowkey range (lexicographic byte order) and assigns each Region to a Region server. If many writes share a common prefix, for example a timestamp-first key like 2024-01-01T00:00:01, they all land in the same Region and therefore on the same Region server. This creates a hot spot that degrades write throughput and leaves other servers idle.

Use one of the following techniques to spread writes across Region servers:

Salting with a hash prefix

Prepend the first few characters of an MD5 hash of the leading field to the rowkey. Because the hash is deterministic, the same input always maps to the same prefix, so a reader that knows the field value can recompute the prefix and still use a GET or a prefix scan.

[md5(user_id).subStr(0, 4)][user_id][order_id]

Tradeoff: rows for the same user stay together because they share the same hash prefix, but different users are no longer stored in user_id order. A range scan across several user IDs is not possible; query each user separately with its own prefix.
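The following is a minimal sketch of the hash-prefix pattern, using only the Java standard library. The class and method names are hypothetical, and keys are built as strings for readability; a production key would more likely be assembled as bytes.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class SaltedRowkey {
    /** First `chars` hex characters of the MD5 digest of `value`. */
    static String md5Prefix(String value, int chars) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(value.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.substring(0, chars);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /** Builds [md5(user_id).subStr(0, 4)][user_id][order_id]. */
    static String rowkey(String userId, String orderId) {
        return md5Prefix(userId, 4) + userId + orderId;
    }

    /** Prefix shared by all rows of one user; usable as a scan prefix. */
    static String userPrefix(String userId) {
        return md5Prefix(userId, 4) + userId;
    }
}
```

Because the prefix is derived from user_id alone, every rowkey for a user starts with userPrefix(userId), so a reader can rebuild the prefix without any lookup.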

Reversing the key

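Reverse the leading field when its trailing characters change faster than its leading ones. For example, reversing a user ID that increments over time puts the fastest-changing digits first, so consecutive IDs fall into different key ranges.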

[reverse(user_id)][order_id]

Tradeoff: natural ordering is lost, so range scans on the reversed field are not meaningful.
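As a small illustration of the reversal, assuming user IDs are decimal strings (the class name and sample IDs are hypothetical):

```java
final class ReversedKey {
    /** Builds [reverse(user_id)][order_id]. */
    static String rowkey(String userId, String orderId) {
        return new StringBuilder(userId).reverse().toString() + orderId;
    }

    public static void main(String[] args) {
        // Consecutive IDs share the prefix "1002" before reversal
        // and start with different characters after it.
        System.out.println(rowkey("10021", "A")); // 12001A
        System.out.println(rowkey("10022", "A")); // 22001A
        System.out.println(rowkey("10023", "A")); // 32001A
    }
}
```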

Bucketing with modulo

Assign each row to a bucket using a modulo operation, then prepend the bucket number. This is effective for time series data where timestamps are monotonically increasing.

long bucket = timestamp % numBuckets;
[bucket][timestamp][hostname][log_event]

Tradeoff: to retrieve all data for a time range, you must scan all numBuckets ranges and merge the results.
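The sketch below shows both sides of the bucket pattern: building the key on write, and producing one scan range per bucket on read. It is an illustration under stated assumptions, not a reference implementation: the class name, the one-byte bucket, the 8-byte big-endian timestamp, and the bucket count of 8 are all hypothetical choices, and timestamps are assumed to be non-negative so that their byte order matches their numeric order.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class BucketedRowkey {
    static final int NUM_BUCKETS = 8; // illustrative value

    static byte bucketOf(long timestamp) {
        return (byte) Math.floorMod(timestamp, (long) NUM_BUCKETS);
    }

    /** [bucket][timestamp]: 1-byte bucket, then 8-byte big-endian timestamp. */
    static byte[] keyPrefix(byte bucket, long timestamp) {
        return ByteBuffer.allocate(1 + Long.BYTES).put(bucket).putLong(timestamp).array();
    }

    /** Full key: [bucket][timestamp][hostname][log_event]. */
    static byte[] rowkey(long timestamp, String hostname, String logEvent) {
        byte[] host = hostname.getBytes(StandardCharsets.UTF_8);
        byte[] event = logEvent.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(1 + Long.BYTES + host.length + event.length)
                .put(bucketOf(timestamp)).putLong(timestamp).put(host).put(event)
                .array();
    }

    /** One {start, stop} pair per bucket for startTs (inclusive) to endTs (exclusive). */
    static List<byte[][]> scanRanges(long startTs, long endTs) {
        List<byte[][]> ranges = new ArrayList<>();
        for (int b = 0; b < NUM_BUCKETS; b++) {
            ranges.add(new byte[][] {keyPrefix((byte) b, startTs), keyPrefix((byte) b, endTs)});
        }
        return ranges;
    }
}
```

A reader would run one scan per returned range and merge the results by timestamp, which is the read-side cost of this pattern.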

Adding a random suffix

Append a random number to distribute writes across multiple rows.

[user_id][order_id][random(100)]

Tradeoff: a GET needs the complete rowkey, including the random suffix. Unless the suffix is recorded in an index, a reader has to fall back to a prefix scan on [user_id][order_id] or try every possible suffix.
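To make the read-side cost concrete, the sketch below writes with a random two-digit suffix and then lists every key a reader would have to consider. The class name, the zero-padded suffix format, and the constant are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

final class RandomSuffixRowkey {
    static final int SUFFIXES = 100;

    /** Write path: [user_id][order_id][random(100)], suffix padded to two digits. */
    static String writeKey(String userId, String orderId) {
        int suffix = ThreadLocalRandom.current().nextInt(SUFFIXES);
        return userId + orderId + String.format("%02d", suffix);
    }

    /** Read path: the suffix is unknown, so every possible key is a candidate. */
    static List<String> candidateKeys(String userId, String orderId) {
        List<String> keys = new ArrayList<>();
        for (int suffix = 0; suffix < SUFFIXES; suffix++) {
            keys.add(userId + orderId + String.format("%02d", suffix));
        }
        return keys;
    }
}
```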

How to choose: If rows must be read back by key, with a GET or a prefix scan, prefer a deterministic prefix (hash or modulo bucket) over a random suffix. A reader can recompute a deterministic prefix from the key fields, whereas a random suffix has to be stored somewhere or searched for.

Rowkey length: keep it short

Rowkeys are stored with every column value in HBase. A long rowkey multiplies storage overhead across every column in every row. Keep rowkeys as short as possible:

Replace strings with numeric types. A long takes 8 bytes; the string "2015122410" takes 10 bytes, and an MD5 hash written as a hex string takes 32 bytes (16 bytes in raw binary form). Store the long value 2015122410 instead of the string "2015122410".
Use codes instead of full names. For example, use tb instead of "Taobao".
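The size difference can be checked directly. This sketch, with hypothetical class and method names, encodes the same value both ways using only the Java standard library:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class RowkeySizes {
    /** Encodes the value as an 8-byte big-endian long. */
    static byte[] asLong(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    /** Encodes the value as its decimal string, one byte per digit. */
    static byte[] asString(long value) {
        return Long.toString(value).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(asLong(2015122410L).length);   // 8
        System.out.println(asString(2015122410L).length); // 10
    }
}
```

The saving per key looks small, but it is paid once per stored cell, so it scales with the number of rows and columns.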

Field boundary clarity: prevent partial matches

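When a rowkey concatenates multiple fields without delimiters or fixed widths, a scan range may return extra rows. For example, if the rowkey is [column1][column2][column3] and you scan from host1 (start key) to host2 (stop key) to read the rows whose column1 is host1, the rows for host12 also fall in that range, because host12... sorts between host1 and host2.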

Two approaches prevent this:

Fixed-length padding: Pad each field to a fixed width so boundaries are unambiguous.

[rpad(column1, 'x', 20)][column2]

Delimiter: Separate fields with a delimiter character.

[column1][_][column2]

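Fixed-length padding is more efficient for scans, at the cost of longer keys. Delimiters are easier to read, but the delimiter must be a character that never appears in a field value.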
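The sketch below reproduces the host1 and host12 overlap and shows how each approach removes it. It assumes ASCII-only keys, so that Java string comparison matches byte order; the class, method names, and sample values are hypothetical.

```java
final class FieldBoundaries {
    /** True if row falls in the scan range [start, stop). ASCII keys assumed. */
    static boolean inRange(String row, String start, String stop) {
        return row.compareTo(start) >= 0 && row.compareTo(stop) < 0;
    }

    /** Right-pads value with pad up to width characters. */
    static String rpad(String value, char pad, int width) {
        if (value.length() > width) {
            throw new IllegalArgumentException("value longer than field width");
        }
        StringBuilder sb = new StringBuilder(value);
        while (sb.length() < width) {
            sb.append(pad);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // No boundary: a row for host12 lands inside the host1 range.
        System.out.println(inRange("host12cpu", "host1", "host2"));     // true
        // Delimiter: scan from "host1_" up to "host1`" (the byte after '_').
        System.out.println(inRange("host12_cpu", "host1_", "host1`"));  // false
        System.out.println(inRange("host1_cpu", "host1_", "host1`"));   // true
        // Padding: the padded field of host12 does not start with that of host1.
        System.out.println((rpad("host12", 'x', 8) + "cpu").startsWith(rpad("host1", 'x', 8))); // false
    }
}
```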

Descending order: use reverse timestamps

By default, HBase scans return rows in ascending key order. If you need the most recent entries first, two options are available:

Option 1: Reverse scan API (scan.setReversed(true))

Simpler to implement, but reverse scans perform worse than forward scans. Use this when descending order is an occasional requirement.

Option 2: Reverse timestamp in the rowkey

Store Long.MAX_VALUE - timestamp instead of the raw timestamp. This inverts the natural sort order so that newer entries appear first in a forward scan.

long reverseTimestamp = Long.MAX_VALUE - timestamp;
[hostname][log_event][reverseTimestamp]

Use this when descending order is the primary access pattern and scan performance is critical.
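The effect on sort order can be verified without a cluster. The sketch below encodes reverse timestamps as 8-byte big-endian values and compares them as unsigned bytes, which is how rowkeys are ordered. The class and method names are hypothetical, and timestamps are assumed to be non-negative.

```java
import java.nio.ByteBuffer;

final class ReverseTimestamp {
    /** Encodes Long.MAX_VALUE - timestamp as 8 big-endian bytes. */
    static byte[] encode(long timestamp) {
        return ByteBuffer.allocate(Long.BYTES).putLong(Long.MAX_VALUE - timestamp).array();
    }

    /** Recovers the original timestamp from an encoded value. */
    static long decode(byte[] encoded) {
        return Long.MAX_VALUE - ByteBuffer.wrap(encoded).getLong();
    }

    /** Unsigned lexicographic comparison of two byte arrays. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }
}
```

With this encoding, compare(encode(2000), encode(1000)) is negative: the newer timestamp sorts first, so a forward scan returns it first. The subtraction is reversible, so the original timestamp does not need to be stored separately.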

Design examples

The right rowkey design depends on your primary access patterns. The same dataset can require a different design depending on how it is queried. The examples below show how access patterns drive design decisions.

Log data and time series data

The data elements are: hostname, log_event, timestamp.

| Access pattern | Rowkey design | Notes |
| --- | --- | --- |
| Query a metric for a host over a time range | `[hostname][log_event][timestamp]` | Efficient for range scans per host. May create hot spots if a single host dominates writes |
| Query the most recent records for a host | `[hostname][log_event][Long.MAX_VALUE - timestamp]` | Reverse timestamp puts the latest entries first in a forward scan |
| Distribute writes evenly across time (large data volumes or no dominant host) | `[bucket][timestamp][hostname][log_event]` where `bucket = timestamp % numBuckets` | Requires scanning all bucket ranges to aggregate results for a time range |

How to choose: Start with [hostname][log_event][timestamp] if per-host range queries are the primary use case. Switch to the bucket pattern if write hot spots appear or if time-range queries span many hosts.
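One detail is easy to miss with the reverse-timestamp design: the bounds of a time-range scan swap, because the newest timestamp now produces the smallest key. The sketch below illustrates this with string keys. The class name, the _ delimiter, and the zero-padded 19-digit encoding are assumptions made for readability.

```java
final class RecentFirstRange {
    /** Builds [hostname]_[log_event]_[Long.MAX_VALUE - timestamp], zero-padded. */
    static String rowkey(String hostname, String logEvent, long timestamp) {
        return hostname + "_" + logEvent + "_"
                + String.format("%019d", Long.MAX_VALUE - timestamp);
    }

    /** Inclusive start key: derived from the END of the time range. */
    static String startRow(String hostname, String logEvent, long endTs) {
        return rowkey(hostname, logEvent, endTs);
    }

    /** Exclusive stop key: derived from the START of the time range. */
    static String stopRow(String hostname, String logEvent, long startTs) {
        return rowkey(hostname, logEvent, startTs);
    }
}
```

A scan from startRow(host, event, endTs) to stopRow(host, event, startTs) covers timestamps greater than startTs and up to and including endTs, returned newest first.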

Transaction data

A transaction involves three key fields: a buyer, a seller, and an order number. Different access patterns require different rowkey designs, and often multiple tables.

| Access pattern | Table | Rowkey design |
| --- | --- | --- |
| Query a seller's orders in a time range | Seller table | `[seller_id][timestamp][order_number]` |
| Query a buyer's orders in a time range | Buyer table | `[buyer_id][timestamp][order_number]` |
| Look up an order by order number | Index table | `[order_number]` |

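Create all three tables to cover all three access patterns. The buyer and seller tables cannot be searched efficiently by order_number, because it is not a rowkey prefix in either table. To look up a single order, GET it from the index table by order_number. If the index row stores only a reference, such as the fields that make up the buyer or seller rowkey, use those fields to GET the full record from that table.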
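The interaction between the three tables can be sketched with in-memory sorted maps standing in for HBase tables. Everything here is illustrative: the class name, the _ delimiter, the 13-digit zero-padded timestamp, and the choice to store the seller rowkey in the index row are assumptions, not part of the design above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

final class OrderTables {
    // A TreeMap keeps keys sorted, as HBase does (ASCII keys assumed).
    final TreeMap<String, String> sellerTable = new TreeMap<>();
    final TreeMap<String, String> buyerTable = new TreeMap<>();
    final TreeMap<String, String> indexTable = new TreeMap<>();

    static String ts(long timestamp) {
        return String.format("%013d", timestamp);
    }

    /** Writes one order to all three tables. */
    void putOrder(String sellerId, String buyerId, long timestamp,
                  String orderNumber, String details) {
        String sellerKey = sellerId + "_" + ts(timestamp) + "_" + orderNumber;
        sellerTable.put(sellerKey, details);
        buyerTable.put(buyerId + "_" + ts(timestamp) + "_" + orderNumber, details);
        // Assumption: the index row stores the seller rowkey as a reference.
        indexTable.put(orderNumber, sellerKey);
    }

    /** Range read of one party's orders, startTs inclusive, endTs exclusive. */
    private static List<String> range(TreeMap<String, String> table, String id,
                                      long startTs, long endTs) {
        return new ArrayList<>(
                table.subMap(id + "_" + ts(startTs), id + "_" + ts(endTs)).values());
    }

    List<String> sellerOrders(String sellerId, long startTs, long endTs) {
        return range(sellerTable, sellerId, startTs, endTs);
    }

    List<String> buyerOrders(String buyerId, long startTs, long endTs) {
        return range(buyerTable, buyerId, startTs, endTs);
    }

    /** Point lookup: index table first, then the seller table. */
    String orderByNumber(String orderNumber) {
        String sellerKey = indexTable.get(orderNumber);
        return sellerKey == null ? null : sellerTable.get(sellerKey);
    }
}
```

Each write touches three tables, which is the price of serving all three access patterns with range scans or point lookups instead of full-table filters.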

What's next

HBase data model overview
Secondary indexes in HBase
Performance tuning for HBase tables