The benchmark tests compare the throughput, response latency, and compression ratio of HBase Community Edition and ApsaraDB for Lindorm.
The throughput benchmark tests run both systems with the same number of threads and compare their throughput. The response latency benchmark tests apply the same workload to both systems and compare their response latency. The compression ratio benchmark tests write the same amount of data into both systems and compare their on-disk sizes.
Create a table
Create tables in the HBase Community Edition cluster and the ApsaraDB for Lindorm cluster. All test cases use tables with the same schema. Pre-split each table into 200 partitions based on the key distribution of the Yahoo Cloud Serving Benchmark (YCSB) data.
The table created in the cluster of ApsaraDB for Lindorm uses the exclusive INDEX encoding algorithm and the Zstandard (ZSTD) compression algorithm. If you set the encoding algorithm to DIFF, it is automatically upgraded to the INDEX encoding algorithm. You can execute the following statement to create the table:
create 'test', {NAME => 'f', DATA_BLOCK_ENCODING => 'DIFF', COMPRESSION => 'ZSTD'}, {SPLITS => (1..199).map{|i| "user#{(i * ((2**63-1)/199)).to_s.rjust(19, "0")}"} }
The table created in the cluster of HBase Community Edition uses the DIFF encoding and SNAPPY compression algorithms, which are recommended by the HBase community. You can execute the following statement to create the table:
create 'test', {NAME => 'f', DATA_BLOCK_ENCODING => 'DIFF', COMPRESSION => 'SNAPPY'}, {SPLITS => (1..199).map{|i| "user#{(i * ((2**63-1)/199)).to_s.rjust(19, "0")}"} }
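The SPLITS expression in the preceding statements divides the signed 64-bit key space evenly into 200 regions. A minimal Python sketch of the same computation, shown only to illustrate the split-key arithmetic (it is not part of the HBase shell workflow):

```python
# Compute the 199 split keys that divide the signed 64-bit key space
# into 200 evenly sized regions, mirroring the Ruby expression in the
# HBase shell statement above.
STEP = (2**63 - 1) // 199  # integer width of one region

splits = [f"user{str(i * STEP).zfill(19)}" for i in range(1, 200)]

print(len(splits))   # 199 split points -> 200 regions
print(splits[0])     # smallest split key
print(splits[-1])    # largest split key
```

Each key is the prefix `user` followed by a zero-padded 19-digit number, so the split keys sort lexicographically in the same order as the underlying integers.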
Prepare the test data
Prepare the data for the single-row read tests and the range read tests.
A single table stores 2 billion rows. Each row stores 20 columns. The size of each column is 20 bytes.
The following code block shows the YCSB profile:
recordcount=2000000000
operationcount=150000000
workload=com.yahoo.ycsb.workloads.CoreWorkload
readallfields=false
fieldcount=20
fieldlength=20
readproportion=1.0
updateproportion=0.0
scanproportion=0
insertproportion=0
requestdistribution=uniform
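The raw payload implied by this profile can be checked with a quick calculation. The figure below counts field data only; row keys and storage overhead are not included:

```python
# Raw field data implied by the profile above:
# 2 billion rows x 20 columns x 20 bytes per column.
rows, cols, col_bytes = 2_000_000_000, 20, 20

total_bytes = rows * cols * col_bytes
print(total_bytes)                 # 800_000_000_000
print(total_bytes / 10**9, "GB")   # 800.0 GB (decimal gigabytes)
```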
Run the following command to launch YCSB:
bin/ycsb load hbase10 -P <workload> -p table=test -threads 200 -p columnfamily=f -s
Throughput benchmark tests
The throughput benchmark tests compare the throughput of HBase Community Edition with that of ApsaraDB for Lindorm based on the same number of threads. The tests cover four scenarios, which are independent of each other.
- Read data in a single row
The test dataset stores 2 billion rows. Each row stores 20 columns, and each column is 20 bytes in size. The query range is 10 million rows. After the data is prepared, perform a major compaction and wait for it to complete. Run a warm-up test for 20 minutes, and then run a formal test for 20 minutes.
The following code block shows the workload configuration in the YCSB profile:
recordcount=10000000
operationcount=2000000000
workload=com.yahoo.ycsb.workloads.CoreWorkload
readallfields=false
fieldcount=1
fieldlength=20
readproportion=1.0
updateproportion=0.0
scanproportion=0
insertproportion=0
requestdistribution=uniform
Run the following command to launch YCSB:
bin/ycsb run hbase10 -P <workload> -p table=test -threads 200 -p columnfamily=f -p maxexecutiontime=1200
- Read data within a specified range
The test dataset stores 2 billion rows. Each row stores 20 columns, and each column is 20 bytes in size. The query range is 10 million rows, and 50 rows are read in each scan. After the data is prepared, perform a major compaction and wait for it to complete. Run a warm-up test for 20 minutes, and then run a formal test for 20 minutes.
The following code block shows the workload configuration in the YCSB profile:
recordcount=10000000
operationcount=2000000000
workload=com.yahoo.ycsb.workloads.CoreWorkload
readallfields=false
fieldcount=1
fieldlength=20
readproportion=0.0
updateproportion=0.0
scanproportion=1.0
insertproportion=0
requestdistribution=uniform
maxscanlength=50
Lindorm.usepagefilter=false
Run the following command to launch YCSB:
bin/ycsb run hbase10 -P <workload> -p table=test -threads 100 -p columnfamily=f -p maxexecutiontime=1200
- Write data into a single row
Insert one column into the table at a time. The size of each column is 20 bytes. Run the test for 20 minutes.
The following code block shows the workload configuration in the YCSB profile:
recordcount=2000000000
operationcount=100000000
workload=com.yahoo.ycsb.workloads.CoreWorkload
readallfields=false
fieldcount=1
fieldlength=20
readproportion=0.0
updateproportion=0.0
scanproportion=0
insertproportion=1.0
requestdistribution=uniform
Run the following command to launch YCSB:
bin/ycsb run hbase10 -P <workload> -p table=test -threads 200 -p columnfamily=f -p maxexecutiontime=1200
- Write data into multiple rows
Insert one column into the table at a time. The size of each column is 20 bytes. Write data into 100 rows per batch. Run the test for 20 minutes.
The following code block shows the workload configuration in the YCSB profile:
recordcount=2000000000
operationcount=10000000
workload=com.yahoo.ycsb.workloads.CoreWorkload
fieldcount=1
fieldlength=20
cyclickey=true
readallfields=false
readproportion=0
updateproportion=0
scanproportion=0
insertproportion=0.0
batchproportion=1.0
batchsize=100
requestdistribution=uniform
Run the following command to launch YCSB:
bin/ycsb run hbase10 -P <workload> -p table=test -threads 100 -p columnfamily=f -p maxexecutiontime=1200
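The four throughput workloads above share most properties and differ only in the operation mix and operation counts. As an illustration, a small Python sketch that writes a YCSB workload properties file from a dictionary of overrides; the file name and helper name here are illustrative, not part of YCSB:

```python
# Write a YCSB workload properties file from a dict of overrides.
# The base settings mirror the profiles above.
BASE = {
    "workload": "com.yahoo.ycsb.workloads.CoreWorkload",
    "readallfields": "false",
    "fieldcount": "1",
    "fieldlength": "20",
    "requestdistribution": "uniform",
}

def write_workload(path, **overrides):
    """Merge overrides onto BASE and write key=value lines."""
    props = {**BASE, **overrides}
    with open(path, "w") as f:
        for key, value in props.items():
            f.write(f"{key}={value}\n")

# Example: the single-row read throughput workload.
write_workload(
    "workload_single_read",
    recordcount="10000000",
    operationcount="2000000000",
    readproportion="1.0",
    updateproportion="0.0",
    scanproportion="0",
    insertproportion="0",
)
```

The generated file can then be passed to YCSB via `-P workload_single_read`.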
Response latency benchmark tests
The response latency benchmark tests compare the response latency of HBase Community Edition with that of ApsaraDB for Lindorm at the same number of operations per second (OPS).
- Read data in a single row
The test dataset stores 2 billion rows. Each row stores 20 columns, and each column is 20 bytes in size. The query range is 10 million rows. The maximum OPS is 5000. After the data is prepared, perform a major compaction and wait for it to complete. Run a warm-up test for 20 minutes, and then run a formal test for 20 minutes.
The following code block shows the workload configuration in the YCSB profile:
recordcount=10000000
operationcount=2000000000
workload=com.yahoo.ycsb.workloads.CoreWorkload
readallfields=false
fieldcount=1
fieldlength=20
readproportion=1.0
updateproportion=0.0
scanproportion=0
insertproportion=0
requestdistribution=uniform
Run the following command to launch YCSB:
bin/ycsb run hbase10 -P <workload> -p table=test -threads 200 -p columnfamily=f -p maxexecutiontime=1200 -p target=5000
- Read data within a specified range
The test dataset stores 2 billion rows. Each row stores 20 columns, and each column is 20 bytes in size. The query range is 10 million rows, and 50 rows are read in each scan. The maximum OPS is 5000. After the data is prepared, perform a major compaction and wait for it to complete. Run a warm-up test for 20 minutes, and then run a formal test for 20 minutes.
The following code block shows the workload configuration in the YCSB profile:
recordcount=10000000
operationcount=2000000000
workload=com.yahoo.ycsb.workloads.CoreWorkload
readallfields=false
fieldcount=1
fieldlength=20
readproportion=0.0
updateproportion=0.0
scanproportion=1.0
insertproportion=0
requestdistribution=uniform
maxscanlength=50
Lindorm.usepagefilter=false
Run the following command to launch YCSB:
bin/ycsb run hbase10 -P <workload> -p table=test -threads 100 -p columnfamily=f -p maxexecutiontime=1200 -p target=5000
- Write data into a single row
Insert one column into the table at a time. The size of each column is 20 bytes. Run the test for 20 minutes. The maximum OPS is 50000.
The following code block shows the workload configuration in the YCSB profile:
recordcount=2000000000
operationcount=100000000
workload=com.yahoo.ycsb.workloads.CoreWorkload
readallfields=false
fieldcount=1
fieldlength=20
readproportion=0.0
updateproportion=0.0
scanproportion=0
insertproportion=1.0
requestdistribution=uniform
Run the following command to launch YCSB:
bin/ycsb run hbase10 -P <workload> -p table=testwrite -threads 200 -p columnfamily=f -p maxexecutiontime=1200 -p target=50000
- Write data into multiple rows
Insert one column into the table at a time. The size of each column is 20 bytes. Write data into 100 rows per batch. Run the test for 20 minutes. The maximum OPS is 2000.
The following code block shows the workload configuration in the YCSB profile:
recordcount=2000000000
operationcount=10000000
workload=com.yahoo.ycsb.workloads.CoreWorkload
fieldcount=1
fieldlength=20
cyclickey=true
readallfields=false
readproportion=0
updateproportion=0
scanproportion=0
insertproportion=0.0
batchproportion=1.0
batchsize=100
requestdistribution=uniform
Run the following command to launch YCSB:
bin/ycsb run hbase10 -P <workload> -p table=testwrite -threads 100 -p columnfamily=f -p maxexecutiontime=1200 -p target=2000
Benchmark tests for compression ratios
The compression ratio benchmark tests all follow the same procedure: insert 5 million rows into the table through YCSB, manually trigger a flush and a major compaction, and then check the size of the table. The following combinations of column count and column size are tested:
| Number of columns in each row | Size of each column (bytes) |
| --- | --- |
| 1 | 10 |
| 1 | 100 |
| 20 | 10 |
| 20 | 20 |
The following code block shows the workload configuration in the YCSB profile:
recordcount=5000000
operationcount=150000000
workload=com.yahoo.ycsb.workloads.CoreWorkload
readallfields=false
fieldcount=<Number of columns in each row>
fieldlength=<Size of each column>
readproportion=1.0
requestdistribution=uniform
Run the following command to insert data:
bin/ycsb load hbase10 -P <workload> -p table=test -threads 200 -p columnfamily=f -s
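Given the on-disk size reported for the table after the flush and major compaction, the compression ratio follows from the raw payload written. A sketch of the calculation; the on-disk size used in the example is a hypothetical placeholder, not a measured result:

```python
# Compression ratio = raw field bytes written / on-disk table size.
def compression_ratio(rows, cols, col_bytes, on_disk_bytes):
    """Ratio of raw field payload to the size the table occupies on disk."""
    return (rows * cols * col_bytes) / on_disk_bytes

# Example: the 20-column x 20-byte case writes 5 million rows, i.e.
# 2 GB of raw field data. The 500 MB on-disk size below is a made-up
# placeholder; substitute the size reported after the major compaction.
print(compression_ratio(5_000_000, 20, 20, 500_000_000))  # 4.0
```

A higher ratio means the encoding and compression algorithms pack the same payload into less disk space.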