All Products
Search
Document Center

AnalyticDB:Performance tests

Last Updated:Jan 20, 2025

This topic describes the results of performance tests, including index creation, single-term query, and multi-term query, based on the data of the Wikipedia language database.

Specifications of the test instance

  • Engine version: AnalyticDB for PostgreSQL V7.0.

  • Minor version: V7.1.0.0.

  • Edition: High-availability Edition.

  • Node specifications: 4 cores and 32 GB memory.

  • Number of compute nodes: 4.

  • Node storage capacity: 200 GB.

  • Disk type: PL1 Enterprise SSD (ESSD).

  • Coordinator node resources: 8 compute units (CUs).

Test dataset

The test data is taken from a snapshot of the Wikipedia language database and contains 5 million rows of data. Download the packages "wiki-articles-500w-aa" to "wiki-articles-500w-az" from softhuafei/wiki-dataset on GitHub. In this example, the test data is successively imported 20 times to create a dataset that contains 0.1 billion rows of data.

  • The dataset contains the url, title, and body fields.

  • The dataset is in the JSON format.

  • The dataset is 2.34 GB in size.

Note

You can run the cat wiki-articles-500w* > wiki-articles-500w.json.bz2 command to merge the test data files, and then run the bzip2 -dk wiki-articles-500w.json.bz2 command to decompress the package.

Performance test results

  • The amount of time required to create a BM25 index is approximately 63% of the amount of time required to create a GIN(tsvector) index.

  • In terms of single-term queries, the queries per second (QPS) performance of a BM25 index is six to seven times the QPS performance of a GIN(tsvector) index.

  • In terms of multi-term queries, the QPS performance of a BM25 index is six to eleven times the QPS performance of a GIN(tsvector) index.

  • In terms of single-term AND point queries, the QPS performance of a BM25 index is six to nine times the QPS performance of GIN(tsvector) and B-tree indexes.

  • In terms of single-term AND range queries, the QPS performance of a BM25 index is one to nine times the QPS performance of GIN(tsvector) and B-tree indexes.

Create an index

Index

Amount of time required to import data

Amount of time required to create an index

Amount of time required for stream importing (with indexes)

BM25

24 minutes

36 minutes

84 minutes

GIN(tsvector)

137 minutes

57 minutes

N/A

Perform a single-term query

Query the top 20 results that contain a term in the body column. Example: body:Gabriel Basso. Among the top 20 results, the average occurrence of the term is 56,685. The following table describes the maximum QPS values and the corresponding response time (RT) values.

Concurrency

BM25

GIN(tsquery)

QPS

RT (ms)

QPS

RT (ms)

1

158.86

6.45

25.97

38.35

5

544.43

9.19

73.92

64.35

10

738.21

12.63

89.90

128.65

20

863.07

20.69

172.83

262.89

30

925.02

31.05

239.74

378.18

40

1,234.45

39.64

203.91

616.93

Perform a multi-term query

Query the top 20 results that contain one or more terms in the body column. Example: body:handsworth OR body:cern. All terms are extracted from the test dataset. The terms whose occurrence is less than 100 are not included. The number of terms contained in each query conforms to the weight distribution parameter [0.03, 0.15, 0.25, 0.25, 0.15, 0.08, 0.04, 0.03, 0.02]. In this case, 3% of queries contain one term and 15% of queries contain two terms.

Concurrency

BM25

GIN(tsvector)

QPS

RT (ms)

QPS

RT (ms)

1

216.88

4.54

31.97

31.49

5

842.12

5.86

104.89

47.27

10

1,161.79

8.58

112.89

87.40

20

1,332.61

15.03

118.87

159.25

30

1,380.56

21.66

120.93

264.10

40

1,470.47

26.73

131.87

341.69

Perform a single-term AND point query

Only a single row of data is matched based on the point query conditions.

SELECT * FROM articles ORDER BY body @@@ pgsearch.config('body:"David" AND id:80000000');
SELECT * FROM articles_gin WHERE  body_ts @@ to_tsquery('David') AND id = '80000000';

Concurrency

BM25

GIN(tsvector) and B-tree

QPS

RT (ms)

QPS

RT (ms)

1

342.92

2.81

2.00

985.66

5

1,434.90

3.30

4.99

2,401.95

10

2,293.54

4.21

6.99

4,306.61

20

3,084.70

6.32

13.98

8,536.26

30

2,835.97

10.38

19.98

12,565.44

40

3,081.71

12.78

24.97

17,342.22

Perform a single-term AND range query

Ten thousand rows of data are matched and 99.99% of data is filtered out based on the range query conditions.

SELECT * FROM articles WHERE body @@ pgsearch.config('body:"David" AND id:[110000 TO 120000]');
SELECT * FROM articles_gin WHERE  body_ts @@ to_tsquery('David') AND id >= '110000' AND id <= '120000';

Concurrency

BM25

GIN(tsvector) and B-tree

QPS

RT (ms)

QPS

RT (ms)

1

9.99

110.55

1.00

1,189.95

5

22.97

229.57

4.99

2,074.63

10

25.97

451.81

8.99

4,236.33

20

32.97

951.86

14.99

8,674.06

30

29.97

1,366.22

19.98

13,458.88

40

39.97

2,071.66

32.96

18,251.81