These benchmark tests compare throughput, response latency, and compression ratio between an Apache HBase cluster and a Lindorm cluster using Yahoo Cloud Serving Benchmark (YCSB).
Test workflow:
1. Create tables in both clusters using the same schema.
2. Load 2 billion rows of data using YCSB.
3. Run throughput tests across four scenarios with the same number of threads.
4. Run response latency tests across the same four scenarios with a fixed OPS target.
5. Run compression ratio tests across four column/size combinations.
Create tables
Both clusters use the same table schema with 200 pre-split partitions based on YCSB data.
For instructions on using Lindorm Shell to create tables, see Use Lindorm Shell to connect to LindormTable.
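Both create statements below pre-split the table into 200 regions using the same 199 split keys. As a sketch (assuming YCSB's usual key layout of "user" followed by a zero-padded number), the SPLITS expression evaluates to evenly spaced keys over the signed 64-bit range:

```ruby
# Reproduce the split points used in the create statements below.
# 199 evenly spaced points over [0, 2**63 - 1] yield 200 regions.
step = (2**63 - 1) / 199                       # integer division: 46348603200275255
splits = (1..199).map { |i| "user#{(i * step).to_s.rjust(19, '0')}" }

puts splits.size    # 199
puts splits.first   # user0046348603200275255
puts splits.last    # user9223372036854775745
```

Because both clusters use identical split points, region distribution is the same on both sides and does not skew the comparison.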
Lindorm cluster — uses INDEX encoding (a Lindorm-exclusive algorithm activated by setting DATA_BLOCK_ENCODING to DIFF) and Zstandard (ZSTD) compression:
create 'test', {NAME => 'f', DATA_BLOCK_ENCODING => 'DIFF', COMPRESSION => 'ZSTD'}, {SPLITS => (1..199).map{|i| "user#{(i * ((2**63-1)/199)).to_s.rjust(19, "0")}"} }

Apache HBase cluster — uses DIFF encoding and SNAPPY compression as recommended by Apache HBase:
create 'test', {NAME => 'f', DATA_BLOCK_ENCODING => 'DIFF', COMPRESSION => 'SNAPPY'}, {SPLITS => (1..199).map{|i| "user#{(i * ((2**63-1)/199)).to_s.rjust(19, "0")}"} }

Load data
Each table contains 2 billion rows, 20 columns per row, and 20 bytes per column.
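A back-of-the-envelope estimate of the loaded data volume (value bytes only, before row keys, timestamps, and HFile overhead) follows from these numbers:

```ruby
# Raw value payload of the loaded dataset, a rough sketch that ignores
# key bytes and on-disk storage overhead.
rows          = 2_000_000_000
cols          = 20
bytes_per_col = 20

raw_bytes = rows * cols * bytes_per_col
puts raw_bytes                   # 800000000000
puts "#{raw_bytes / 10**9} GB"   # 800 GB of raw values
```

The actual on-disk footprint is smaller after encoding and compression, which is what the compression ratio test at the end measures.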
YCSB profile:
recordcount=2000000000
operationcount=150000000
workload=com.yahoo.ycsb.workloads.CoreWorkload
readallfields=false
fieldcount=20
fieldlength=20
readproportion=1.0
updateproportion=0.0
scanproportion=0
insertproportion=0
requestdistribution=uniform

Run the following command to load data:
bin/ycsb load hbase10 -P <workload> -p table=test -threads 200 -p columnfamily=f -s

Throughput test
The throughput test runs each scenario with the same number of threads on both clusters. The four scenarios are independent of each other.
All scenarios run with maxexecutiontime=1200 (a 20-minute test). For the read scenarios, trigger a major compaction and wait for it to complete, then run a 20-minute warm-up before the formal test.
Single-row read
Simulates high-concurrency point lookup. Reads one row at a time from a 10-million-row query range within the 2-billion-row dataset. The query range (recordcount=10000000) is smaller than the total dataset to simulate a hot-spot read pattern against a realistic data volume.
| Parameter | Value |
|---|---|
| Rows in dataset | 2 billion |
| Query range | 10 million rows |
| Columns per row | 20 |
| Column size | 20 bytes |
| Threads | 200 |
| Warm-up | 20 minutes |
| Formal test | 20 minutes |
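The parameters above imply a compact hot set inside a much larger table, which can be sized as follows (a sketch counting value bytes only):

```ruby
# Working set touched by the point-lookup scenario, ignoring key bytes
# and storage overhead.
hot_rows   = 10_000_000
total_rows = 2_000_000_000

value_bytes = hot_rows * 20 * 20                        # 20 columns x 20 bytes
puts "#{value_bytes / 10**9} GB of values in the hot range"
puts format('%.2f%% of the keyspace', 100.0 * hot_rows / total_rows)
```

So the test repeatedly hits roughly 4 GB of values, about 0.5% of the table, which is what makes it a hot-spot read pattern rather than a full-table scan.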
YCSB profile:
recordcount=10000000
operationcount=2000000000
workload=com.yahoo.ycsb.workloads.CoreWorkload
readallfields=false
fieldcount=20
fieldlength=20
readproportion=1.0
updateproportion=0.0
scanproportion=0
insertproportion=0
requestdistribution=uniform

Stress test command:
bin/ycsb run hbase10 -P <workload> -p table=test -threads 200 -p columnfamily=f -p maxexecutiontime=1200

Range scan
Simulates batch scan workloads. Reads 50 consecutive rows per scan from a 10-million-row query range.
| Parameter | Value |
|---|---|
| Rows in dataset | 2 billion |
| Query range | 10 million rows |
| Columns per row | 20 |
| Column size | 20 bytes |
| Rows per scan | 50 |
| Threads | 100 |
| Warm-up | 20 minutes |
| Formal test | 20 minutes |
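Each scan in this scenario returns a fixed payload, which can be estimated from the table above (value bytes only, excluding keys and RPC framing):

```ruby
# Value bytes returned by a single scan: 50 rows x 20 columns x 20 bytes.
rows_per_scan = 50
payload = rows_per_scan * 20 * 20
puts "#{payload} bytes (~20 KB) per scan"
```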
YCSB profile:
recordcount=10000000
operationcount=2000000000
workload=com.yahoo.ycsb.workloads.CoreWorkload
readallfields=false
fieldcount=20
fieldlength=20
readproportion=0.0
updateproportion=0.0
scanproportion=1.0
insertproportion=0
requestdistribution=uniform
maxscanlength=50
Lindorm.usepagefilter=false

Stress test command:
bin/ycsb run hbase10 -P <workload> -p table=test -threads 100 -p columnfamily=f -p maxexecutiontime=1200

Single-row insert
Simulates high-frequency single-row write workloads. Each operation inserts one row containing a single 20-byte column.
| Parameter | Value |
|---|---|
| Columns per insert | 1 |
| Column size | 20 bytes |
| Threads | 200 |
| Test duration | 20 minutes |
YCSB profile:
recordcount=2000000000
operationcount=100000000
workload=com.yahoo.ycsb.workloads.CoreWorkload
readallfields=false
fieldcount=1
fieldlength=20
readproportion=0.0
updateproportion=0.0
scanproportion=0
insertproportion=1.0
requestdistribution=uniform

Stress test command:
bin/ycsb run hbase10 -P <workload> -p table=test -threads 200 -p columnfamily=f -p maxexecutiontime=1200

Batch insert
Simulates bulk write workloads. Each operation inserts a batch of 100 rows, each containing a single 20-byte column.
| Parameter | Value |
|---|---|
| Columns per insert | 1 |
| Column size | 20 bytes |
| Batch size | 100 rows |
| Threads | 100 |
| Test duration | 20 minutes |
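Because every operation writes a 100-row batch, row throughput is 100x the OPS figure that YCSB reports, and the configured operation count bounds the total rows written:

```ruby
# Effective write volume of the batch-insert run. The 20-minute
# maxexecutiontime cap means this is an upper bound, not a guarantee.
operationcount = 10_000_000
batchsize      = 100

rows_written = operationcount * batchsize
puts "at most #{rows_written} rows"   # 1000000000 (1 billion rows)
```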
YCSB profile:
recordcount=2000000000
operationcount=10000000
workload=com.yahoo.ycsb.workloads.CoreWorkload
fieldcount=1
fieldlength=20
cyclickey=true
readallfields=false
readproportion=0
updateproportion=0
scanproportion=0
insertproportion=0.0
batchproportion=1.0
batchsize=100
requestdistribution=uniform

Stress test command:
bin/ycsb run hbase10 -P <workload> -p table=test -threads 100 -p columnfamily=f -p maxexecutiontime=1200

Response latency test
The response latency test uses the same workload configurations as the throughput test, but adds a -p target=<N> flag to cap OPS at a fixed value. This keeps the load identical across both clusters, so latency differences reflect cluster performance rather than load variation.
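YCSB's target throttling divides the total target OPS across client threads, and each thread paces itself to a fixed inter-request gap. For the single-row read scenario below, the pacing works out as:

```ruby
# Per-thread pacing implied by -p target=5000 with 200 threads.
target_ops = 5_000
threads    = 200

per_thread = target_ops.to_f / threads
puts "#{per_thread} ops per second per thread"     # 25.0
puts "#{1000.0 / per_thread} ms between requests"  # 40.0
```

At such a low per-thread rate, measured latency reflects the cluster's response time rather than client-side queuing.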
Single-row read
| Parameter | Value |
|---|---|
| Query range | 10 million rows |
| Columns per row | 20 |
| Column size | 20 bytes |
| Threads | 200 |
| Max OPS | 5,000 |
| Warm-up | 20 minutes |
| Formal test | 20 minutes |
YCSB profile: same as single-row read throughput test.
Stress test command:
bin/ycsb run hbase10 -P <workload> -p table=test -threads 200 -p columnfamily=f -p maxexecutiontime=1200 -p target=5000

Range scan
| Parameter | Value |
|---|---|
| Query range | 10 million rows |
| Rows per scan | 50 |
| Threads | 100 |
| Max OPS | 5,000 |
| Warm-up | 20 minutes |
| Formal test | 20 minutes |
YCSB profile: same as range scan throughput test.
Stress test command:
bin/ycsb run hbase10 -P <workload> -p table=test -threads 100 -p columnfamily=f -p maxexecutiontime=1200 -p target=5000

Single-row insert
| Parameter | Value |
|---|---|
| Columns per insert | 1 |
| Column size | 20 bytes |
| Threads | 200 |
| Max OPS | 50,000 |
| Test duration | 20 minutes |
YCSB profile: same as single-row insert throughput test.
Stress test command:
bin/ycsb run hbase10 -P <workload> -p table=testwrite -threads 200 -p columnfamily=f -p maxexecutiontime=1200 -p target=50000

Batch insert
| Parameter | Value |
|---|---|
| Columns per insert | 1 |
| Column size | 20 bytes |
| Batch size | 100 rows |
| Threads | 100 |
| Max OPS | 2,000 |
| Test duration | 20 minutes |
YCSB profile: same as batch insert throughput test.
Stress test command:
bin/ycsb run hbase10 -P <workload> -p table=testwrite -threads 100 -p columnfamily=f -p maxexecutiontime=1200 -p target=2000

Compression ratio test
Each compression ratio test loads 5 million rows into both clusters, then triggers a flush and major compaction. After compaction completes, compare the on-disk table size between both clusters.
Run this procedure for each of the following column configurations:
| Columns per row | Column size (bytes) |
|---|---|
| 1 | 10 |
| 20 | 10 |
| 20 | 20 |
| 100 | 10 |
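For each configuration, the raw value bytes loaded serve as the baseline against which the post-compaction on-disk size can be compared (a sketch; it ignores key bytes and storage metadata, and "ratio" here is one common convention, not a definition from the test itself):

```ruby
# Raw value bytes per column configuration at 5 million rows. One way to
# express compression ratio is on-disk table size divided by this baseline.
rows = 5_000_000
[[1, 10], [20, 10], [20, 20], [100, 10]].each do |cols, bytes|
  raw = rows * cols * bytes
  puts format('%3d cols x %2d B -> %4d MB raw', cols, bytes, raw / 10**6)
end
```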
YCSB profile (replace <fieldcount> and <fieldlength> with values from the table above):
recordcount=5000000
operationcount=150000000
workload=com.yahoo.ycsb.workloads.CoreWorkload
readallfields=false
fieldcount=<fieldcount>
fieldlength=<fieldlength>
readproportion=1.0
requestdistribution=uniform

Load data command:
bin/ycsb load hbase10 -P <workload> -p table=test -threads 200 -p columnfamily=f -s

After loading, trigger a flush and major compaction manually, then check the table size in both clusters.