This topic presents a performance benchmark of the pgvector extension with an HNSW index on an ApsaraDB RDS for PostgreSQL instance. The benchmark uses the ann-benchmarks tool to evaluate the performance of the RDS implementation compared to the community version across key metrics such as recall rate, queries per second (QPS), and index build time.
Test environment
To avoid inaccuracies due to network fluctuations, place your RDS PostgreSQL instance and client ECS instance in the same Virtual Private Cloud (VPC) and VSwitch.
| Component | Description |
| RDS PostgreSQL instance | |
| ECS instance | |
| Test tool | Important: By default, ANN-Benchmarks tests single-threaded performance. To test the concurrent performance of the instance, see pgvector performance test (based on an IVF index). |
Prerequisites
RDS for PostgreSQL instance
- Create a high-privilege account named ann_testuser and a test database named ann_testdb. For more information, see Create accounts and databases.
- Install the pgvector plugin (vector) in the ann_testdb database. For more information, see Manage plugins.
Client ECS instance
- Install Docker. For more information, see Install Docker.
- Run the following commands to download the ann-benchmarks tool.
    cd ~
    git clone https://github.com/erikbern/ann-benchmarks.git
- Run the following commands to create and activate a Python 3.10.6 virtual environment named ann_test using Conda.
    yum install git
    yum install conda
    conda create -n ann_test python=3.10.6
    conda init bash
    source /usr/etc/profile.d/conda.sh
    conda activate ann_test
- Run the following commands to install the ann-benchmarks dependencies.
    cd ~/ann-benchmarks/
    pip install -r requirements.txt
Test procedure
All steps in this procedure are performed in the ann_test virtual environment. If the session times out or exits for any reason, use the command conda activate ann_test to re-enter the environment.
Step 1: Configure connection settings
Edit the ~/ann-benchmarks/ann_benchmarks/algorithms/pgvector/module.py file in the benchmark tool to add the following connection settings. Fill in the values based on your actual environment.
# Set the connection parameters for PostgreSQL
os.environ['ANN_BENCHMARKS_PG_USER'] = 'ann_testuser' # The user for the RDS PostgreSQL instance
os.environ['ANN_BENCHMARKS_PG_PASSWORD'] = 'testPassword' # The password for the RDS PostgreSQL user
os.environ['ANN_BENCHMARKS_PG_DBNAME'] = 'ann_testdb' # The database name of the RDS PostgreSQL instance
os.environ['ANN_BENCHMARKS_PG_HOST'] = 'pgm-****.pg.rds.aliyuncs.com' # The internal endpoint of the RDS PostgreSQL instance
os.environ['ANN_BENCHMARKS_PG_PORT'] = '5432' # The internal port number of the RDS PostgreSQL instance
os.environ['ANN_BENCHMARKS_PG_START_SERVICE'] = 'false' # Disable automatic service startup
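Before you run the benchmark, it can help to confirm that the connection settings are complete. The following is a minimal sketch and is not part of ann-benchmarks: the helper name build_conninfo is hypothetical, and the only thing it assumes is the set of environment variable names shown above. It maps them to standard libpq connection keywords.

```python
import os

# Maps the environment variables shown above to libpq connection keywords.
ENV_TO_KEYWORD = {
    "ANN_BENCHMARKS_PG_USER": "user",
    "ANN_BENCHMARKS_PG_PASSWORD": "password",
    "ANN_BENCHMARKS_PG_DBNAME": "dbname",
    "ANN_BENCHMARKS_PG_HOST": "host",
    "ANN_BENCHMARKS_PG_PORT": "port",
}


def build_conninfo(env=None):
    """Hypothetical helper: build a libpq-style connection string.

    Raises ValueError if a setting is missing or the port is not numeric.
    Values that contain spaces would need quoting, which this sketch omits.
    """
    env = os.environ if env is None else env
    missing = [name for name in ENV_TO_KEYWORD if not env.get(name)]
    if missing:
        raise ValueError("Missing connection settings: " + ", ".join(missing))
    if not env["ANN_BENCHMARKS_PG_PORT"].isdigit():
        raise ValueError("ANN_BENCHMARKS_PG_PORT must be a number")
    return " ".join(f"{kw}={env[name]}" for name, kw in ENV_TO_KEYWORD.items())


if __name__ == "__main__":
    # Placeholder values for illustration only.
    example = {
        "ANN_BENCHMARKS_PG_USER": "ann_testuser",
        "ANN_BENCHMARKS_PG_PASSWORD": "secret",
        "ANN_BENCHMARKS_PG_DBNAME": "ann_testdb",
        "ANN_BENCHMARKS_PG_HOST": "example.internal",
        "ANN_BENCHMARKS_PG_PORT": "5432",
    }
    print(build_conninfo(example))
```

The resulting string can be passed to a PostgreSQL client to check connectivity from the ECS instance before a long benchmark run.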
Step 2: Configure ann-benchmarks test parameters
Based on your test requirements, edit the ~/ann-benchmarks/ann_benchmarks/algorithms/pgvector/config.yml file in the benchmark tool. For example:
float:
any:
- base_args: ['@metric']
constructor: PGVector
disabled: false
docker_tag: ann-benchmarks-pgvector
module: ann_benchmarks.algorithms.pgvector
name: pgvector
run_groups:
M-16(100):
arg_groups: [{M: 16, efConstruction: 100}]
args: {}
query_args: [[10, 20, 40, 80, 120, 200, 400, 800]]
M-16(200):
arg_groups: [{M: 16, efConstruction: 200}]
args: {}
query_args: [[10, 20, 40, 80, 120, 200, 400, 800]]
M-24(200):
arg_groups: [{M: 24, efConstruction: 200}]
args: {}
query_args: [[10, 20, 40, 80, 120, 200, 400, 800]]
This test is divided into three groups: M-16(100), M-16(200), and M-24(200). Each test group uses arg_groups to configure the parameters for creating the HNSW index and query_args to configure the retrieval parameters.
| Parameter group | Parameter | Description |
| arg_groups | M | Corresponds to the m parameter of the HNSW index: the maximum number of connections per layer. A larger value increases the graph's density, which generally improves recall but also increases index build and query time. |
| arg_groups | efConstruction | Corresponds to the ef_construction parameter of the HNSW index: the size of the dynamic candidate list used to construct the graph. A larger value generally improves recall but also increases index build and query time. |
| query_args | ef_search | A query-time setting that specifies the size of the candidate set maintained during the search process. A larger value generally improves recall but also increases query time. |
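To make the mapping from the configuration to SQL concrete, the following sketch expands the three run groups into the pgvector statements they roughly correspond to. It is an illustration under assumptions, not the tool's actual code: the table name items, the column name embedding, and the helper names are hypothetical, and vector_cosine_ops is assumed because the example dataset uses the Angular distance type.

```python
# Run groups and query arguments copied from the example config.yml above.
RUN_GROUPS = {
    "M-16(100)": {"M": 16, "efConstruction": 100},
    "M-16(200)": {"M": 16, "efConstruction": 200},
    "M-24(200)": {"M": 24, "efConstruction": 200},
}
EF_SEARCH_VALUES = [10, 20, 40, 80, 120, 200, 400, 800]


def create_index_sql(m, ef_construction, table="items", column="embedding"):
    """Build an HNSW CREATE INDEX statement (table/column names are hypothetical)."""
    return (
        f"CREATE INDEX ON {table} USING hnsw ({column} vector_cosine_ops) "
        f"WITH (m = {int(m)}, ef_construction = {int(ef_construction)})"
    )


def set_ef_search_sql(ef_search):
    """Build the session-level setting that controls the search candidate list."""
    return f"SET hnsw.ef_search = {int(ef_search)}"


def benchmark_plan():
    """Return (group name, index statement, list of ef_search statements)."""
    plan = []
    for name, args in RUN_GROUPS.items():
        index_sql = create_index_sql(args["M"], args["efConstruction"])
        query_sqls = [set_ef_search_sql(v) for v in EF_SEARCH_VALUES]
        plan.append((name, index_sql, query_sqls))
    return plan


if __name__ == "__main__":
    for name, index_sql, query_sqls in benchmark_plan():
        print(name)
        print("  " + index_sql)
        print("  " + query_sqls[0] + "  ... (" + str(len(query_sqls)) + " settings)")
```

Each run group builds the index once and then repeats the queries for every ef_search value, so the example configuration yields 3 index builds and 24 query passes.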
Step 3: Build the test Docker image
- (Optional) To skip testing the community version of PostgreSQL (the default), change the ~/ann-benchmarks/ann_benchmarks/algorithms/pgvector/Dockerfile file as follows:
    FROM ann-benchmarks
    USER root
    RUN pip install psycopg[binary] pgvector
- Run install.py with --algorithm pgvector to build the test Docker image.
    cd ~/ann-benchmarks/
    python install.py --algorithm pgvector
  Note: To view supported parameters, run python install.py --help.
Step 4: Get the dataset
The benchmark script automatically downloads the specified public dataset.
This topic uses the nytimes-256-angular dataset, which uses the Angular distance type, as an example of text similarity search. For more public datasets, see ann-benchmarks.
| Dataset | Dimension | Number of rows | Test vectors | Top-N neighbors | Distance type |
| NYTimes | 256 | 290,000 | 10,000 | 100 | Angular |
If the public dataset does not meet your testing needs, convert your own vector data into a standard format such as HDF5 to test retrieval performance. For more information, see Appendix II: Custom test dataset.
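The dataset's precomputed top-N neighbors serve as the ground truth against which recall is measured. The following sketch shows a simplified recall calculation based on set overlap. The function names are hypothetical, and the metric code in ann-benchmarks may differ in detail (for example, in how it handles ties between equally distant neighbors).

```python
def recall_at_k(true_neighbors, returned_ids, k):
    """Fraction of the k true nearest neighbors that the index returned."""
    truth = set(true_neighbors[:k])
    found = set(returned_ids[:k])
    return len(truth & found) / k


def mean_recall(all_true, all_returned, k):
    """Average recall over all test queries."""
    scores = [recall_at_k(t, r, k) for t, r in zip(all_true, all_returned)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy ground truth and index results for two queries (IDs are made up).
    ground_truth = [[1, 2, 3, 4], [5, 6, 7, 8]]
    index_results = [[1, 2, 9, 4], [5, 6, 7, 8]]
    print(mean_recall(ground_truth, index_results, k=4))
```

With -k 10, as used later in this topic, only the first 10 of the 100 stored neighbors for each test vector are compared.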
Step 5: Run test and get results
- Use the following commands to run the benchmark script.
    cd ~/ann-benchmarks
    nohup python run.py --dataset nytimes-256-angular -k 10 --algorithm pgvector --runs 1 > ann_benchmark_test.log 2>&1 &
    tail -f ann_benchmark_test.log
  | Parameter | Description |
  | --dataset | Specifies the dataset to test. |
  | -k | The LIMIT value in the query SQL statement, which is the number of results to return. |
  | --algorithm | The vector database algorithm to test. Set to pgvector for this test. |
  | --runs | The number of test runs. The best result from the runs is selected. |
  | --parallelism | The test concurrency. The default is 1. |
- Run the following commands to generate a plot of the test results.
    cd ~/ann-benchmarks
    python plot.py --dataset nytimes-256-angular --recompute
  Focus on the index build metrics and the performance gains at the same recall.
  Index build comparison:
  | Index type | Creation time (s) | Index size |
  | HNSW community version (v0.8.0) | 127.89 | 7820 MB |
  | HNSW RDS version (v0.8.0.1) | 77.72 | 3916 MB |
  QPS comparison at the same recall:
  | Recall | v0.8.0 HNSW parameters | v0.8.0.1 HNSW parameters | v0.8.0 QPS | v0.8.0.1 QPS |
  | 97% | ef_search=10 | ef_search=10 | 1635.41 | 1954.48 |
  | 98% | ef_search=30 | ef_search=20 | 1075.49 | 1379.61 |
  | 99.3% | ef_search=120 | ef_search=80 | 434.74 | 737.06 |
  | 99.5% | ef_search=300 | ef_search=350 | 200.06 | 319.19 |
  | 99.6% | ef_search=500 | ef_search=500 | 109.51 | 177.94 |
- (Optional) Run the following commands to export the detailed test results to a CSV file.
    cd ~/ann-benchmarks
    python data_export.py --out res.csv
  For example, you can analyze the relationship between the m and ef_construction parameters and the index build time:
  | m | ef_construction | Build time (s) |
  | 16 | 100 | 33.35161 |
  | 16 | 200 | 57.66014 |
  | 24 | 200 | 87.22608 |
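The comparison is easier to read as relative gains. The following sketch only restates the figures from the index build and QPS comparisons in this step as percentages and ratios; the helper names are hypothetical.

```python
def percent_reduction(baseline, improved):
    """Reduction relative to the baseline, in percent."""
    return (baseline - improved) / baseline * 100.0


def qps_ratio(baseline_qps, improved_qps):
    """How many times more queries per second the improved version serves."""
    return improved_qps / baseline_qps


# Figures copied from the index build and QPS comparisons in this step.
BUILD_TIME_S = {"community": 127.89, "rds": 77.72}
INDEX_SIZE_MB = {"community": 7820, "rds": 3916}
QPS_BY_RECALL = {
    "97%": (1635.41, 1954.48),
    "98%": (1075.49, 1379.61),
    "99.3%": (434.74, 737.06),
    "99.5%": (200.06, 319.19),
    "99.6%": (109.51, 177.94),
}

if __name__ == "__main__":
    build = percent_reduction(BUILD_TIME_S["community"], BUILD_TIME_S["rds"])
    size = percent_reduction(INDEX_SIZE_MB["community"], INDEX_SIZE_MB["rds"])
    print(f"Build time reduction: {build:.1f}%")
    print(f"Index size reduction: {size:.1f}%")
    for recall, (community, rds) in QPS_BY_RECALL.items():
        print(f"Recall {recall}: QPS ratio {qps_ratio(community, rds):.2f}x")
```

In these results, the relative QPS gain is larger at the higher recall levels than at 97% recall.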
Conclusions
When you use an HNSW index:
- Increasing m, ef_construction, or ef_search improves the recall rate but lowers QPS.
- Increasing m or ef_construction also increases the index build time.
- If your application requires a high recall rate, do not rely on the default parameters (m=16, ef_construction=64, and ef_search=40). Tune them for your workload.
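One way to apply these conclusions is to choose, among the measured settings, the one with the highest QPS that still meets a target recall. The sketch below does this with the RDS version (v0.8.0.1) figures from the QPS comparison in Step 5. The function name is hypothetical, and the result only holds for that dataset and instance.

```python
# (ef_search, recall, QPS) for the RDS version, copied from the QPS comparison.
RDS_MEASUREMENTS = [
    (10, 0.970, 1954.48),
    (20, 0.980, 1379.61),
    (80, 0.993, 737.06),
    (350, 0.995, 319.19),
    (500, 0.996, 177.94),
]


def pick_setting(measurements, target_recall):
    """Return the measurement with the highest QPS whose recall meets the target.

    Returns None when no measured setting reaches the target recall.
    """
    candidates = [m for m in measurements if m[1] >= target_recall]
    if not candidates:
        return None
    return max(candidates, key=lambda m: m[2])


if __name__ == "__main__":
    for target in (0.97, 0.99, 0.999):
        print(target, pick_setting(RDS_MEASUREMENTS, target))
```

If no setting reaches the target, as with 0.999 here, the remaining options are a larger ef_search or rebuilding the index with larger m and ef_construction values.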
Appendix I: Parameter impact on index building
This appendix uses test results from ann-benchmarks to show how various RDS for PostgreSQL and vector parameters affect index building.
RDS for PostgreSQL: Parameter impact on index building
For example, when the ann-benchmarks test parameters are set to m=16 and efConstruction=64, the RDS for PostgreSQL parameters affect index building as follows.
- maintenance_work_mem
  This parameter specifies the maximum memory in kilobytes (KB) for maintenance operations, such as VACUUM and CREATE INDEX. When the maintenance_work_mem value is less than the dataset size, increasing it reduces the index build time. However, once the value exceeds the dataset size, the build time no longer decreases.
  For example, on an RDS for PostgreSQL instance of type pg.x8.2xlarge.2c (16 cores, 128 GB), with the max_parallel_maintenance_workers parameter set to its default value of 8 and using the nytimes-256-angular dataset (approximately 324 MB), the maintenance_work_mem parameter affects index building as follows:
  | maintenance_work_mem | Index build time (s) |
  | 64 MB (65536 KB) | 52.82 |
  | 128 MB (131072 KB) | 46.79 |
  | 256 MB (262144 KB) | 36.40 |
  | 512 MB (524288 KB) | 18.90 |
  | 1 GB (1048576 KB) | 19.06 |
- max_parallel_maintenance_workers
  This parameter sets the maximum number of parallel workers for CREATE INDEX. The index build time decreases as max_parallel_maintenance_workers increases, and levels off once the value reaches the number of CPU cores (16 in this test).
  For example, on an RDS for PostgreSQL instance of type pg.x8.2xlarge.2c (16 cores, 128 GB), with the maintenance_work_mem parameter set to 2 GB (2048 MB) and using the nytimes-256-angular dataset (approximately 324 MB), the max_parallel_maintenance_workers parameter affects index building as follows:
  | max_parallel_maintenance_workers | Index build time (s) |
  | 1 | 76.00 |
  | 2 | 51.34 |
  | 4 | 32.49 |
  | 8 | 19.66 |
  | 12 | 14.44 |
  | 16 | 13.07 |
  | 24 | 13.15 |
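The effect of parallel workers can be expressed as a speedup relative to the single-worker build. The sketch below uses the build times measured for max_parallel_maintenance_workers above. The helper name is hypothetical, and the efficiency figure is a rough indicator only, because it divides by the configured value and ignores the leader process.

```python
# Index build times (s) by max_parallel_maintenance_workers, from the test above.
BUILD_TIME_BY_WORKERS = {
    1: 76.00,
    2: 51.34,
    4: 32.49,
    8: 19.66,
    12: 14.44,
    16: 13.07,
    24: 13.15,
}


def scaling_rows(times):
    """Return (workers, speedup, efficiency) rows relative to the 1-worker build."""
    base = times[1]
    rows = []
    for workers in sorted(times):
        speedup = base / times[workers]
        rows.append((workers, speedup, speedup / workers))
    return rows


if __name__ == "__main__":
    for workers, speedup, efficiency in scaling_rows(BUILD_TIME_BY_WORKERS):
        print(f"{workers:>2} workers: {speedup:.2f}x speedup, {efficiency:.0%} efficiency")
```

The speedup grows up to 16 workers and does not improve at 24, which matches the 16 cores of the test instance.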
Vector parameter impact on index building
With the RDS for PostgreSQL parameters set to maintenance_work_mem=8 GB (8388608 KB) and max_parallel_maintenance_workers=16, the following sections show the impact of vector parameters on index building.
- Vector dimension
  The following tests use the GloVe dataset (1,183,514 vectors) with ann-benchmarks test parameters set to m=16, efConstruction=64, and ef_search=40. The results show that as the vector dimension increases, the index build time and query latency also increase, while both recall and QPS decrease.
  | Dimension | Build time (s) | Recall | QPS | p99 (ms) |
  | 25 | 195.10 | 0.99985 | 192.94 | 7.84 |
  | 50 | 236.92 | 0.99647 | 152.36 | 9.69 |
  | 100 | 319.36 | 0.97231 | 126.89 | 11.14 |
  | 200 | 529.33 | 0.93186 | 95.05 | 15.11 |
  Note: p99 is the 99th percentile of query latency. This means 99% of query requests are completed in less time than this value, and only 1% take longer.
- Number of vectors
  These tests use the dbpedia-openai-{n}k-angular dataset series, where n represents the number of vectors in thousands (from 100 to 1,000). The ann-benchmarks test parameters are set to m=48, efConstruction=256, and ef_search=200. The results show that the index build time increases faster than linearly as the number of vectors grows, while both recall and QPS decrease and query latency increases.
  | n (thousands of vectors) | Number of rows (x10,000) | Build time (s) | Recall | QPS | p99 (ms) |
  | 100 | 10 | 54.05 | 0.9993 | 171.74 | 8.93 |
  | 200 | 20 | 137.23 | 0.99901 | 146.78 | 10.81 |
  | 500 | 50 | 436.68 | 0.999 | 118.55 | 13.94 |
  | 1000 | 100 | 957.26 | 0.99879 | 101.60 | 16.35 |
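To quantify how the build time grows with the number of vectors, one option is to estimate a scaling exponent between two measurements: if the build time behaves like n raised to the power p, then p = log(t2/t1) / log(n2/n1). The sketch below applies this to the dbpedia-openai figures above. The power-law model is an assumption made for illustration, not something the test itself establishes.

```python
import math

# (number of vectors, build time in seconds) from the dbpedia-openai results above.
BUILD_TIMES = [
    (100_000, 54.05),
    (200_000, 137.23),
    (500_000, 436.68),
    (1_000_000, 957.26),
]


def scaling_exponent(n1, t1, n2, t2):
    """Exponent p such that t2 / t1 == (n2 / n1) ** p."""
    return math.log(t2 / t1) / math.log(n2 / n1)


if __name__ == "__main__":
    (n_first, t_first), (n_last, t_last) = BUILD_TIMES[0], BUILD_TIMES[-1]
    overall = scaling_exponent(n_first, t_first, n_last, t_last)
    print(f"Overall exponent: {overall:.2f}")
    for (n1, t1), (n2, t2) in zip(BUILD_TIMES, BUILD_TIMES[1:]):
        print(f"{n1} -> {n2}: exponent {scaling_exponent(n1, t1, n2, t2):.2f}")
```

An exponent above 1 means that doubling the number of vectors more than doubles the build time, which is worth accounting for when you extrapolate to larger datasets.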
Appendix II: Custom test dataset
The following example shows how to create a dataset based on the nytimes-256-angular data format.
- Run the following script to create a custom test dataset.
  Note: This example requires the rds_ai extension. For more information, see AI (rds_ai).
    import h5py
    import numpy as np
    import psycopg2
    import pgvector.psycopg2

    # Connection details
    conn_info = {
        'host': 'pgm-****.rds.aliyuncs.com',
        'user': 'ann_testuser',
        'password': '****',
        'port': '5432',
        'dbname': 'ann_testdb'
    }

    embedding_len = 1024
    distance_top_n = 100
    query_batch_size = 100

    try:
        # Connect to the RDS for PostgreSQL database
        with psycopg2.connect(**conn_info) as connection:
            pgvector.psycopg2.register_vector(connection)
            with connection.cursor() as cur:
                # Fetch the vector data in batches
                cur.execute("select count(1) from test_rag")
                count = cur.fetchone()[0]

                train_embeddings = []
                for start in range(0, count, query_batch_size):
                    query = f"SELECT embedding FROM test_rag ORDER BY id OFFSET {start} LIMIT {query_batch_size}"
                    cur.execute(query)
                    res = [embedding[0] for embedding in cur.fetchall()]
                    train_embeddings.extend(res)
                train = np.array(train_embeddings)

                # Get query data and compute embeddings
                with open('query.txt', 'r', encoding='utf-8') as file:
                    queries = [query.strip() for query in file]

                test = []
                # Install the rds_ai extension or use the Alibaba Cloud Model Studio SDK
                for query in queries:
                    # Pass the text as a parameter so that quotes in a query do not break the SQL
                    cur.execute("SELECT rds_ai.embed(%s)::vector(1024)", (query,))
                    test.append(cur.fetchone()[0])
                test = np.array(test)

        # Calculate the top N nearest neighbor distances (angular distance = 1 - cosine similarity)
        dot_product = np.dot(test, train.T)
        norm_test = np.linalg.norm(test, axis=1, keepdims=True)
        norm_train = np.linalg.norm(train, axis=1, keepdims=True)
        similarity = dot_product / (norm_test * norm_train.T)
        distance_matrix = 1 - similarity

        neighbors = np.argsort(distance_matrix, axis=1)[:, :distance_top_n]
        distances = np.take_along_axis(distance_matrix, neighbors, axis=1)

        # Write the dataset in the HDF5 layout that ann-benchmarks expects
        with h5py.File('custom_dataset.hdf5', 'w') as f:
            f.create_dataset('distances', data=distances)
            f.create_dataset('neighbors', data=neighbors)
            f.create_dataset('test', data=test)
            f.create_dataset('train', data=train)
            f.attrs.update({
                "type": "dense",
                "distance": "angular",
                "dimension": embedding_len,
                "point_type": "float"
            })

        print("The HDF5 file was created successfully and the dataset was added.")
    except (Exception, psycopg2.DatabaseError) as error:
        print(f"Error: {error}")
- Register your custom test dataset in the DATASETS section of the ~/ann-benchmarks/ann_benchmarks/datasets.py file.
    DATASETS: Dict[str, Callable[[str], None]] = {
        ......,
        "<custom_dataset>": None,
    }
- Upload the custom test dataset to the ~/ann-benchmarks directory. You can then use the dataset with the run.py test script.
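The ground-truth step in the script above relies on the angular distance, computed as 1 minus the cosine similarity. The following toy sketch reproduces that calculation in plain Python on three made-up two-dimensional vectors, so that the neighbors and distances arrays are easier to interpret. The function names are hypothetical.

```python
import math


def angular_distance(a, b):
    """Angular distance as used in the script above: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_n_neighbors(query, train, n):
    """Return (indices, distances) of the n training vectors closest to the query."""
    scored = sorted((angular_distance(query, v), i) for i, v in enumerate(train))
    indices = [i for _, i in scored[:n]]
    distances = [d for d, _ in scored[:n]]
    return indices, distances


if __name__ == "__main__":
    # Made-up vectors for illustration only.
    train = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    query = [1.0, 0.0]
    print(top_n_neighbors(query, train, n=2))
```

In the generated HDF5 file, each row of neighbors holds the training-row indices for one test vector, ordered from nearest to farthest, and the matching row of distances holds the corresponding values.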