UnixBench Score: An Introduction

By Chao Qian (Xixie)

UnixBench is a benchmark suite that provides a basic indicator of the performance of a Unix-like system. It runs multiple tests, compares each result with the score of a baseline system to produce an index value, and combines the index values into an overall index for the system.

This article introduces UnixBench, focusing on how it calculates scores.

Running Parameters

Many customers run UnixBench with its default settings right after installation and use the final score for comparison.

The following is a sample result from a machine with 4 CPU cores and 8 GB of memory:

------------------------------------------------------------------------
Benchmark run: Mon, June 25, 2018 20:25:47 - 20:54:19
4 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       30971628.9 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     3928.1 MWIPS (12.4 s, 7 samples)
Execl Throughput                               3117.6 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        645027.2 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          229505.4 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1288742.6 KBps  (30.0 s, 2 samples)
Pipe Throughput                             1635960.9 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 337333.8 lps   (10.0 s, 7 samples)
Process Creation                               8238.2 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   5817.0 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   2183.9 lpm   (60.0 s, 2 samples)
System Call Overhead                        2465754.7 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   30971628.9   2654.0
Double-Precision Whetstone                       55.0       3928.1    714.2
Execl Throughput                                 43.0       3117.6    725.0
File Copy 1024 bufsize 2000 maxblocks          3960.0     645027.2   1628.9
File Copy 256 bufsize 500 maxblocks            1655.0     229505.4   1386.7
File Copy 4096 bufsize 8000 maxblocks          5800.0    1288742.6   2222.0
Pipe Throughput                               12440.0    1635960.9   1315.1
Pipe-based Context Switching                   4000.0     337333.8    843.3
Process Creation                                126.0       8238.2    653.8
Shell Scripts (1 concurrent)                     42.4       5817.0   1371.9
Shell Scripts (8 concurrent)                      6.0       2183.9   3639.9
System Call Overhead                          15000.0    2465754.7   1643.8
                                                                   ========
System Benchmarks Index Score                                        1362.9

------------------------------------------------------------------------
Benchmark Run: Mon, June 25, 2018 20:54:19 - 21:22:54
4 CPUs in system; running 4 parallel copies of tests

Dhrystone 2 using register variables      114984418.6 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                    16614.2 MWIPS (11.6 s, 7 samples)
Execl Throughput                              13645.3 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        751698.4 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          230211.7 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1964420.6 KBps  (30.0 s, 2 samples)
Pipe Throughput                             5999380.0 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                1095000.2 lps   (10.0 s, 7 samples)
Process Creation                              34454.9 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                  18218.1 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   2493.1 lpm   (60.0 s, 2 samples)
System Call Overhead                        5643267.3 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0  114984418.6   9853.0
Double-Precision Whetstone                       55.0      16614.2   3020.8
Execl Throughput                                 43.0      13645.3   3173.3
File Copy 1024 bufsize 2000 maxblocks          3960.0     751698.4   1898.2
File Copy 256 bufsize 500 maxblocks            1655.0     230211.7   1391.0
File Copy 4096 bufsize 8000 maxblocks          5800.0    1964420.6   3386.9
Pipe Throughput                               12440.0    5999380.0   4822.7
Pipe-based Context Switching                   4000.0    1095000.2   2737.5
Process Creation                                126.0      34454.9   2734.5
Shell Scripts (1 concurrent)                     42.4      18218.1   4296.7
Shell Scripts (8 concurrent)                      6.0       2493.1   4155.1
System Call Overhead                          15000.0    5643267.3   3762.2
                                                                   ========
System Benchmarks Index Score                                        3357.0

Many people use 3357 for comparison and pay little attention to the single-process score of 1362.9, which is also significant. The multi-process run starts one copy of each test per CPU core (4 copies here), while the single-process run starts only one copy. Both results reflect system performance.

So what is actually executed when you run ./Run? By default, the index module is executed, which contains the following applets:

    "dhry2reg", "whetstone-double", "execl",
    "fstime", "fsbuffer", "fsdisk", "pipe", "context1", "spawn", "shell1", "shell8", "syscall"

They correspond to the above results.

The source code shows that Run accepts the following parameters: ./Run [Module] -i <number of iterations> -c <number of concurrent processes> -q/-v (output mode)

Module: To get the result of a single applet only, specify its module name. This is convenient for debugging or when you have modified the test content. You can also specify the execution program.
Number of iterations: The framework supports two iteration modes, short iterations (3 by default) and long iterations (10 by default). The number you pass with -i sets the count for long iterations. The count for short iterations is then (input number of iterations + 1)/3, and if the result is less than 1, 1 is used. Each applet's configuration determines which of the two modes it uses.
Number of concurrent processes: Set this parameter if you want the number of concurrent processes to differ from the number of CPUs in the system.
Output mode: Select the quiet (-q) or verbose (-v) output mode.
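
The relationship between the two iteration modes can be sketched as follows. This is a small illustration in Python, not the Perl code of the Run script. The function name iteration_counts is hypothetical, and integer division is assumed for the (n + 1)/3 rule.

```python
def iteration_counts(long_iterations=10):
    """Return (long, short) iteration counts for a given -i value.

    Assumption: the short count is (long + 1) / 3 using integer
    division, and it never drops below 1, as described above.
    """
    short_iterations = max((long_iterations + 1) // 3, 1)
    return long_iterations, short_iterations


if __name__ == "__main__":
    for n in (1, 3, 10):
        print(n, iteration_counts(n))
```

With the default of 10 long iterations, this gives 3 short iterations, which agrees with the defaults stated above.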

Score Calculation

Each applet runs with a configured iteration mode, execution time, and number of concurrent copies. The same settings apply to every copy in a given run.

Score Calculation with a Single Process

The following is an example of score calculation with a single process:

Process Creation -- 1 copy
==> "/opt/unixbench/UnixBench/pgms/spawn" 30 2>&1 >> "/opt/unixbench/UnixBench/results/VM_0_13_centos-2018-06-25-05.log"

#### Pass 1


# COUNT0: 247371 #Score
# COUNT1: 1 #timebase constant
# COUNT2: lps #Unit label (loops per second)
# elapsed: 30.003119 #Time consumed
# pid: 16803 #Process ID
# status: 0 #Whether exiting is successful

#### Pass 2


# COUNT0: 242919
# COUNT1: 1
# COUNT2: lps
# elapsed: 30.002898
# pid: 5035
# status: 0

#### Pass 3


# COUNT0: 243989
# COUNT1: 1
# COUNT2: lps
# elapsed: 30.002732
# pid: 21228
# status: 0

*Dump score:     242919.0
Count score:     243989.0
Count score:     247371.0

>>>> Results of 1 copy
>>>> score: 8188.34084738901
>>>> time: 30.0029255
>>>> iterations: 2

COUNT0, COUNT1, and COUNT2 are obtained from a line that each test program prints in the following format:

COUNT|x|y|z

After parsing, the fields are assigned as follows:

    COUNT0 = x
    COUNT1 = y
    COUNT2 = z
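
As a rough illustration, the parsing of the pipe-separated COUNT line might look like the sketch below. The function name parse_count_line is hypothetical, and the sample line is reconstructed from the Pass 1 values in the log above rather than copied from real program output.

```python
def parse_count_line(line):
    """Split a 'COUNT|...' line into COUNT0, COUNT1, COUNT2, ... fields."""
    prefix, *fields = line.strip().split("|")
    if prefix != "COUNT":
        raise ValueError(f"not a COUNT line: {line!r}")
    # Fields are kept as strings; numeric conversion is left to the caller.
    return {f"COUNT{i}": value for i, value in enumerate(fields)}


if __name__ == "__main__":
    # Sample line reconstructed from the Pass 1 values (an assumption).
    print(parse_count_line("COUNT|247371|1|lps"))
```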

The score calculation process is as follows:

1. The passes are sorted in ascending order of COUNT0, and the lowest third of the results is discarded (shown as "Dump score" in the log).
2. For each remaining pass, if a timebase is set (COUNT1 > 0) and the elapsed time is greater than 0, $product += log($COUNT0) - log(Time elapsed / $timebase). Otherwise, the elapsed time is ignored and $product += log($COUNT0).
3. Single performance score: $score = exp($product/2), where 2 is the number of passes that remain after the discard (reported as "iterations: 2" in the log). The calculated score is the same as the system-generated score.

In other words, the score is a geometric mean. Taking logarithms first narrows the gap between the results of different passes, the logarithms are then averaged, and the exponential of that average gives the score. This method also applies to multiple performance scores.
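
The steps above can be checked against the Process Creation log with a short sketch. It is written in Python rather than the Perl of the Run script, and the name benchmark_score is hypothetical. It drops the lowest third of the passes, which matches the "Dump score" line in the log.

```python
import math

# (COUNT0, COUNT1 timebase, elapsed seconds) for the three passes in the log.
PASSES = [
    (247371, 1, 30.003119),
    (242919, 1, 30.002898),
    (243989, 1, 30.002732),
]


def benchmark_score(passes):
    """Geometric mean of per-pass rates after dropping the lowest third."""
    ordered = sorted(passes, key=lambda p: p[0])
    kept = ordered[len(ordered) // 3:]  # discard the lowest-scoring third
    log_sum = 0.0
    for count, timebase, elapsed in kept:
        if timebase > 0 and elapsed > 0:
            # Normalize the count to the timebase before averaging.
            log_sum += math.log(count) - math.log(elapsed / timebase)
        else:
            log_sum += math.log(count)
    return math.exp(log_sum / len(kept))


if __name__ == "__main__":
    print(benchmark_score(PASSES))
```

Under these assumptions the result should land very close to the 8188.34 reported in the log.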

Multi-Process Score Calculation

When multiple copies run concurrently, the score is calculated as in the following example:

Shell Scripts (1 concurrent) -- 4 copies
==> "/opt/unixbench/UnixBench/pgms/looper" 60 "/opt/unixbench/UnixBench/pgms/multi.sh" 1 2>&1 >> "/opt/unixbench/UnixBench/results/VM_0_13_centos-2018-06-25-05.log"

#### Pass 1


# COUNT0: 4614
# COUNT1: 60
# COUNT2: lpm
# elapsed: 60.005639
# pid: 20858
# status: 0

# COUNT0: 4596
# COUNT1: 60
# COUNT2: lpm
# elapsed: 60.009496
# pid: 20859
# status: 0

# COUNT0: 4592
# COUNT1: 60
# COUNT2: lpm
# elapsed: 60.011761
# pid: 20862
# status: 0

# COUNT0: 4614
# COUNT1: 60
# COUNT2: lpm
# elapsed: 60.011930
# pid: 20864
# status: 0

#### Pass 2


# COUNT0: 4547
# COUNT1: 60
# COUNT2: lpm
# elapsed: 60.005597
# pid: 10791
# status: 0

# COUNT0: 4590
# COUNT1: 60
# COUNT2: lpm
# elapsed: 60.013270
# pid: 10793
# status: 0

# COUNT0: 4578
# COUNT1: 60
# COUNT2: lpm
# elapsed: 60.006054
# pid: 10794
# status: 0

# COUNT0: 4561
# COUNT1: 60
# COUNT2: lpm
# elapsed: 60.014214
# pid: 10797
# status: 0

#### Pass 3


# COUNT0: 4631
# COUNT1: 60
# COUNT2: lpm
# elapsed: 60.013816
# pid: 31734
# status: 0

# COUNT0: 4632
# COUNT1: 60
# COUNT2: lpm
# elapsed: 60.012614
# pid: 31735
# status: 0

# COUNT0: 4637
# COUNT1: 60
# COUNT2: lpm
# elapsed: 60.005633
# pid: 31737
# status: 0

# COUNT0: 4645
# COUNT1: 60
# COUNT2: lpm
# elapsed: 60.006082
# pid: 31740
# status: 0

*Dump score:      18276.0
Count score:      18416.0
Count score:      18545.0

>>>> Sum of 4 copies
>>>> score: 18477.4244713467
>>>> time: 60.009621375
>>>> iterations: 2

The score calculation process is as follows:

Score: The scores of all copies in a pass are summed. For example, 18545 is the sum of the four copy scores in Pass 3 (4631 + 4632 + 4637 + 4645).
Time elapsed: The elapsed times of the copies in a pass are averaged. For Pass 3, (60.013816 + 60.012614 + 60.005633 + 60.006082)/4 = 60.00953625.
timebase: Taken from COUNT1 (60 in this example).

The preceding three steps combine the results of the concurrent copies into a single result for each pass. The subsequent calculation is the same as for a single process.
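
To make the aggregation concrete, the sketch below applies it to the Shell Scripts log above. It is an illustration in Python, not the original Perl. The names combine_copies and multi_copy_score are hypothetical.

```python
import math

# COUNT1 (timebase) is 60 for every copy in the log above.
TIMEBASE = 60

# Per-copy (COUNT0, elapsed seconds), grouped by pass.
PASSES = [
    [(4614, 60.005639), (4596, 60.009496), (4592, 60.011761), (4614, 60.011930)],
    [(4547, 60.005597), (4590, 60.013270), (4578, 60.006054), (4561, 60.014214)],
    [(4631, 60.013816), (4632, 60.012614), (4637, 60.005633), (4645, 60.006082)],
]


def combine_copies(copies):
    """Collapse one pass: sum the counts, average the elapsed times."""
    total = sum(count for count, _ in copies)
    mean_elapsed = sum(elapsed for _, elapsed in copies) / len(copies)
    return total, mean_elapsed


def multi_copy_score(passes, timebase):
    """Combine copies per pass, then score as in the single-process case."""
    combined = sorted(combine_copies(copies) for copies in passes)
    kept = combined[len(combined) // 3:]  # discard the lowest third
    log_sum = sum(math.log(n) - math.log(t / timebase) for n, t in kept)
    return math.exp(log_sum / len(kept))


if __name__ == "__main__":
    print([combine_copies(copies)[0] for copies in PASSES])
    print(multi_copy_score(PASSES, TIMEBASE))
```

The per-pass sums come out as 18416, 18276, and 18545, which are the three scores listed in the log, and the final value should be close to the reported 18477.42.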

Total Score

Take the score calculation with multiple processes illustrated at the beginning of this article as an example.

Each index value is the test result divided by its baseline and multiplied by 10. For example, 114984418.6 / 116700.0 × 10 = 9853.0. (For comparing a single item, the factor of 10 does not matter. It may be intended to bring the index values closer in scale to the total score.)
Calculation of the total score: exp(average(ln(each index value))), that is, the geometric mean of the index values. This reproduces the reported total of 3357.0.
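
As a check on the total, the sketch below recomputes the index values and the overall index from the 4-copy run shown at the beginning of this article. It assumes that each index is result / baseline × 10 and that the total is the geometric mean of the indices. The names index_value and overall_index are hypothetical.

```python
import math

# (baseline, result) pairs copied from the 4-copy run shown earlier.
RESULTS = [
    (116700.0, 114984418.6),  # Dhrystone 2 using register variables
    (55.0, 16614.2),          # Double-Precision Whetstone
    (43.0, 13645.3),          # Execl Throughput
    (3960.0, 751698.4),       # File Copy 1024 bufsize 2000 maxblocks
    (1655.0, 230211.7),       # File Copy 256 bufsize 500 maxblocks
    (5800.0, 1964420.6),      # File Copy 4096 bufsize 8000 maxblocks
    (12440.0, 5999380.0),     # Pipe Throughput
    (4000.0, 1095000.2),      # Pipe-based Context Switching
    (126.0, 34454.9),         # Process Creation
    (42.4, 18218.1),          # Shell Scripts (1 concurrent)
    (6.0, 2493.1),            # Shell Scripts (8 concurrent)
    (15000.0, 5643267.3),     # System Call Overhead
]


def index_value(baseline, result):
    """Index of one test: result relative to the baseline, scaled by 10."""
    return result / baseline * 10


def overall_index(pairs):
    """Geometric mean of the per-test index values."""
    logs = [math.log(index_value(b, r)) for b, r in pairs]
    return math.exp(sum(logs) / len(logs))


if __name__ == "__main__":
    print(overall_index(RESULTS))
```

Because the printed results are rounded to one decimal place, the recomputed total may differ slightly from the reported 3357.0.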

Summary

This concludes the description of score calculation in UnixBench. The tables below illustrate the calculation process of the scores.

[Table: Single Process Score]

[Table: Multi Process Score]

[Table: Total Score]
