
ApsaraDB RDS: Use the pgvector extension to test performance based on HNSW indexes

Last Updated: Mar 28, 2026

This topic shows how to benchmark ApsaraDB RDS for PostgreSQL vector search performance using Hierarchical Navigable Small World (HNSW) indexes. The test uses the ANN-Benchmarks tool to measure recall rate, queries per second (QPS), and index build time across different parameter combinations. ANN-Benchmarks tests single-threaded performance by default. For concurrency performance testing, see Use the pgvector extension to test performance based on IVF indexes.

Test environment

Place the RDS instance and the Elastic Compute Service (ECS) instance in the same virtual private cloud (VPC) and vSwitch to avoid network-induced variance in test results.

| Component | Specification |
| --- | --- |
| RDS instance | PostgreSQL 16, RDS High-availability Edition, pg.x8.2xlarge.2c (16 cores, 128 GB memory), pgvector 0.8.0 |
| ECS instance | ecs.c6.xlarge (4 cores, 8 GiB memory), Alibaba Cloud Linux 3 |
| Test tool | ANN-Benchmarks (single-threaded by default) |

Prerequisites

Before you begin, create the RDS instance and the ECS instance described in the test environment and confirm that they can connect over the internal network.

Set up the ECS instance

Run the following commands on the ECS instance to download ANN-Benchmarks and create a Python environment:

# Install Git and Conda
yum install -y git
yum install -y conda

# Download ANN-Benchmarks
cd ~
git clone https://github.com/erikbern/ann-benchmarks.git

# Create and activate the test environment
conda create -n ann_test python=3.10.6
conda init bash
source /usr/etc/profile.d/conda.sh
conda activate ann_test

# Install dependencies
cd ~/ann-benchmarks/
pip install -r requirements.txt

All subsequent steps run inside the ann_test virtual environment. If you are logged out due to a timeout, run conda activate ann_test to re-enter the environment.

Run the benchmark

Step 1: Configure the RDS connection

Append the following settings to ~/ann-benchmarks/ann_benchmarks/algorithms/pgvector/module.py, replacing the placeholder values with your actual RDS connection details:

# RDS instance connection settings
os.environ['ANN_BENCHMARKS_PG_USER'] = 'ann_testuser'
os.environ['ANN_BENCHMARKS_PG_PASSWORD'] = 'testPawword'       # Replace with your password
os.environ['ANN_BENCHMARKS_PG_DBNAME'] = 'ann_testdb'
os.environ['ANN_BENCHMARKS_PG_HOST'] = 'pgm-****.pg.rds.aliyuncs.com'  # Replace with your internal endpoint
os.environ['ANN_BENCHMARKS_PG_PORT'] = '5432'
os.environ['ANN_BENCHMARKS_PG_START_SERVICE'] = 'false'        # Disable automatic startup
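
To catch typos in these settings early, you can run a quick sanity check in the same Python process after importing the patched module. The variable names below are exactly the six set above; the helper function itself is my own sketch, not part of ANN-Benchmarks:

```python
import os

# The six connection settings appended to module.py above.
REQUIRED = [
    "ANN_BENCHMARKS_PG_USER",
    "ANN_BENCHMARKS_PG_PASSWORD",
    "ANN_BENCHMARKS_PG_DBNAME",
    "ANN_BENCHMARKS_PG_HOST",
    "ANN_BENCHMARKS_PG_PORT",
    "ANN_BENCHMARKS_PG_START_SERVICE",
]

def missing_settings(env=None):
    """Return the names of any connection settings that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]
```

After importing the patched pgvector module, `missing_settings()` should return an empty list.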

Step 2: Configure HNSW test parameters

Edit ~/ann-benchmarks/ann_benchmarks/algorithms/pgvector/config.yml to define the parameter combinations to test. This test covers three groups: M-16(100), M-16(200), and M-24(200).

float:
  any:
  - base_args: ['@metric']
    constructor: PGVector
    disabled: false
    docker_tag: ann-benchmarks-pgvector
    module: ann_benchmarks.algorithms.pgvector
    name: pgvector
    run_groups:
      M-16(100):
        arg_groups: [{M: 16, efConstruction: 100}]
        args: {}
        query_args: [[10, 20, 40, 80, 120, 200, 400, 800]]
      M-16(200):
        arg_groups: [{M: 16, efConstruction: 200}]
        args: {}
        query_args: [[10, 20, 40, 80, 120, 200, 400, 800]]
      M-24(200):
        arg_groups: [{M: 24, efConstruction: 200}]
        args: {}
        query_args: [[10, 20, 40, 80, 120, 200, 400, 800]]

Parameter descriptions:

| Parameter | Description |
| --- | --- |
| M (in arg_groups) | Maximum number of neighboring nodes per node at each graph layer. Higher values produce a denser graph, increasing recall rate and index build time. |
| efConstruction (in arg_groups) | Size of the candidate set during index construction. Higher values improve recall rate at the cost of longer build time. |
| ef_search (in query_args) | Size of the candidate set during queries. Higher values improve recall rate and increase query latency. |

Tuning direction at a glance:

  • To increase recall rate: raise M, efConstruction, or ef_search.

  • To reduce query latency: lower ef_search.

  • To shorten index build time: lower efConstruction or increase parallel workers (see Appendix: Parameter impact on index build time).

  • Default values (M=16, ef_construction=64, ef_search=40) often produce suboptimal recall. Tune from these defaults based on your requirements.
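
For reference, these parameters map onto pgvector's SQL surface: M and efConstruction go into the CREATE INDEX statement, while ef_search is a per-session setting. A minimal sketch that builds the corresponding statements (the table name `items` and column name `embedding` are placeholders):

```python
def hnsw_index_sql(table: str, column: str, m: int, ef_construction: int) -> str:
    """Build-time parameters go into the CREATE INDEX statement (pgvector syntax)."""
    return (
        f"CREATE INDEX ON {table} USING hnsw ({column} vector_cosine_ops) "
        f"WITH (m = {m}, ef_construction = {ef_construction})"
    )

def ef_search_sql(ef_search: int) -> str:
    """The query-time candidate set size is set per session."""
    return f"SET hnsw.ef_search = {ef_search}"
```

For example, the M-16(100) run group corresponds to `hnsw_index_sql('items', 'embedding', 16, 100)` once, followed by one `ef_search_sql(...)` per value in query_args.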

Step 3: Build the test Docker image

  1. (Optional) Modify ~/ann-benchmarks/ann_benchmarks/algorithms/pgvector/Dockerfile to skip the built-in PostgreSQL setup and use only the psycopg and pgvector packages:

    FROM ann-benchmarks
    USER root
    RUN pip install psycopg[binary] pgvector
  2. Build the Docker image:

    cd ~/ann-benchmarks/
    python install.py --algorithm pgvector

    Run python install.py --help to see all available options.

Step 4: Download the dataset

When you run the test script, ANN-Benchmarks automatically downloads the specified dataset. This test uses the nytimes-256-angular dataset. For details on available datasets, see ANN-Benchmarks datasets.

| Dataset | Dimensions | Rows | Test vectors | Nearest neighbors | Distance |
| --- | --- | --- | --- | --- | --- |
| NYTimes | 256 | 290,000 | 10,000 | 100 | Angular |

If the public datasets do not match your workload, convert your vector data to Hierarchical Data Format version 5 (HDF5) format and use it as a custom dataset. See Custom datasets.

Step 5: Run the test and collect results

  1. Run the benchmark:

    cd ~/ann-benchmarks
    nohup python run.py --dataset nytimes-256-angular -k 10 --algorithm pgvector --runs 1 > ann_benchmark_test.log 2>&1 &
    tail -f ann_benchmark_test.log

    | Parameter | Description |
    | --- | --- |
    | --dataset | Dataset to test |
    | -k | LIMIT value in the SQL query; the number of results to return |
    | --algorithm | Algorithm to benchmark (pgvector in this test) |
    | --runs | Number of runs; the best result set is selected |
    | --parallelism | Number of parallel workers (default: 1) |
  2. Generate result plots:

    cd ~/ann-benchmarks
    python plot.py --dataset nytimes-256-angular --recompute
  3. (Optional) Export detailed results including recall rate, QPS, response time (RT), index build time, and index size:

    cd ~/ann-benchmarks
    python data_export.py --out res.csv
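
Once res.csv exists, a small helper can pick the configuration that delivers the highest QPS while meeting a recall target. The column names below ("recall", "qps") are assumptions; inspect the header of your export first, since data_export.py's exact field names may differ:

```python
import csv

def best_qps_at_recall(rows, min_recall, recall_key="recall", qps_key="qps"):
    """Among rows meeting the recall target, return the one with the highest QPS.

    rows: iterable of dicts (e.g. from csv.DictReader).
    Returns None if no row meets the target.
    """
    candidates = [r for r in rows if float(r[recall_key]) >= min_recall]
    if not candidates:
        return None
    return max(candidates, key=lambda r: float(r[qps_key]))
```

Usage: `with open("res.csv") as f: best = best_qps_at_recall(csv.DictReader(f), 0.95)`.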

Test results

Recall rate and QPS

The following table shows results from the nytimes-256-angular dataset. All three groups use the same query_args values (ef_search = 10, 20, 40, 80, 120, 200, 400, 800).

| m | ef_construction | ef_search | Recall rate | QPS |
| --- | --- | --- | --- | --- |
| 16 | 100 | 10 | 0.630 | 1423.985 |
| 16 | 100 | 20 | 0.741 | 1131.941 |
| 16 | 100 | 40 | 0.820 | 836.017 |
| 16 | 100 | 80 | 0.871 | 574.733 |
| 16 | 100 | 120 | 0.894 | 440.076 |
| 16 | 100 | 200 | 0.918 | 297.267 |
| 16 | 100 | 400 | 0.945 | 162.759 |
| 16 | 100 | 800 | 0.969 | 84.268 |
| 16 | 200 | 10 | 0.683 | 1299.667 |
| 16 | 200 | 20 | 0.781 | 1094.968 |
| 16 | 200 | 40 | 0.849 | 790.838 |
| 16 | 200 | 80 | 0.895 | 533.826 |
| 16 | 200 | 120 | 0.914 | 405.975 |
| 16 | 200 | 200 | 0.933 | 272.591 |
| 16 | 200 | 400 | 0.956 | 148.688 |
| 16 | 200 | 800 | 0.977 | 76.555 |
| 24 | 200 | 10 | 0.767 | 1182.747 |
| 24 | 200 | 20 | 0.840 | 922.770 |
| 24 | 200 | 40 | 0.887 | 639.899 |
| 24 | 200 | 80 | 0.920 | 411.140 |
| 24 | 200 | 120 | 0.936 | 303.323 |
| 24 | 200 | 200 | 0.953 | 199.752 |
| 24 | 200 | 400 | 0.973 | 105.506 |
| 24 | 200 | 800 | 0.988 | 53.904 |

Index build time

| m | ef_construction | Build time (seconds) |
| --- | --- | --- |
| 16 | 100 | 33.35 |
| 16 | 200 | 57.66 |
| 24 | 200 | 87.23 |

Conclusions

Increasing m, efConstruction, and ef_search consistently improves recall rate at the cost of lower QPS and longer build time. Specifically:

  • Raising ef_search improves recall rate but reduces QPS.

  • Raising m and efConstruction improves recall rate, reduces QPS, and extends build time.

  • If your application requires high recall, avoid the default parameter values (m=16, ef_construction=64, ef_search=40), which are optimized for speed rather than accuracy.

Appendix: Parameter impact on index build time

Effect of maintenance_work_mem

maintenance_work_mem sets the maximum memory for maintenance operations such as VACUUM and CREATE INDEX (unit: KB). Increasing this value shortens build time — but only up to the size of the dataset. Once maintenance_work_mem exceeds the dataset size, build time stops improving.

The following results use the pg.x8.2xlarge.2c instance type (16 cores, 128 GB memory), max_parallel_maintenance_workers=8, and the nytimes-256-angular dataset (~324 MB).

| maintenance_work_mem | Build time (seconds) |
| --- | --- |
| 64 MB (65,536 KB) | 52.82 |
| 128 MB (131,072 KB) | 46.79 |
| 256 MB (262,144 KB) | 36.40 |
| 512 MB (524,288 KB) | 18.90 |
| 1 GB (1,048,576 KB) | 19.06 |

Build time plateaus between 512 MB and 1 GB because both values exceed the ~324 MB dataset size.

Effect of max_parallel_maintenance_workers

max_parallel_maintenance_workers controls the number of parallel workers for a single CREATE INDEX operation. Build time decreases as this value increases.

The following results use the same instance type, maintenance_work_mem set to 2 GB, and the nytimes-256-angular dataset.

| max_parallel_maintenance_workers | Build time (seconds) |
| --- | --- |
| 1 | 76.00 |
| 2 | 51.34 |
| 4 | 32.49 |
| 8 | 19.66 |
| 12 | 14.44 |
| 16 | 13.07 |
| 24 | 13.15 |
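
Both parameters are ordinary session settings, so they can be issued in the session that will run CREATE INDEX. A minimal sketch that builds the two SET statements (standard PostgreSQL syntax; the helper itself is my own):

```python
def index_build_settings(maintenance_work_mem: str, parallel_workers: int):
    """SQL statements to run in the session that will execute CREATE INDEX."""
    return [
        f"SET maintenance_work_mem = '{maintenance_work_mem}'",
        f"SET max_parallel_maintenance_workers = {parallel_workers}",
    ]
```

For example, `index_build_settings('512MB', 8)` reproduces the fastest combination from the maintenance_work_mem table above.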

Effect of vector dimensions

The following results use the GloVe dataset (1,183,514 rows), m=16, efConstruction=64, and ef_search=40, with maintenance_work_mem=8 GB and max_parallel_maintenance_workers=16.

As vector dimension increases, index build time increases, recall rate decreases, QPS decreases, and query latency increases.

| Dimension | Build time (seconds) | Recall rate | QPS | P99 (ms) |
| --- | --- | --- | --- | --- |
| 25 | 195.10 | 0.99985 | 192.94 | 7.84 |
| 50 | 236.92 | 0.99647 | 152.36 | 9.69 |
| 100 | 319.36 | 0.97231 | 126.89 | 11.14 |
| 200 | 529.33 | 0.93186 | 95.05 | 15.11 |

P99 is the 99th percentile latency: 99% of all query requests complete within this time.

Effect of dataset size

The following results use the dbpedia-openai-{n}k-angular dataset with m=48, efConstruction=256, and ef_search=200.

As the number of rows increases, index build time increases nonlinearly, recall rate decreases, QPS decreases, and query latency increases.

| Dataset size | Rows (10,000s) | Build time (seconds) | Recall rate | QPS | P99 (ms) |
| --- | --- | --- | --- | --- | --- |
| 100k | 10 | 54.05 | 0.9993 | 171.74 | 8.93 |
| 200k | 20 | 137.23 | 0.99901 | 146.78 | 10.81 |
| 500k | 50 | 436.68 | 0.999 | 118.55 | 13.94 |
| 1,000k | 100 | 957.26 | 0.99879 | 101.60 | 16.35 |

Custom datasets

If the public datasets do not represent your workload, generate a custom HDF5 dataset from your own vector data.

This example requires the rds_ai extension. For installation, see Use the AI capabilities provided by the rds_ai extension.
  1. Run the following script to export your vector data and ground-truth neighbors to an HDF5 file:

    import h5py
    import numpy as np
    import psycopg2
    import pgvector.psycopg2
    
    conn_info = {
        'host': 'pgm-****.rds.aliyuncs.com',
        'user': 'ann_testuser',
        'password': '****',
        'port': '5432',
        'dbname': 'ann_testdb'
    }
    
    embedding_len = 1024
    distance_top_n = 100
    query_batch_size = 100
    
    try:
        with psycopg2.connect(**conn_info) as connection:
            pgvector.psycopg2.register_vector(connection)
            with connection.cursor() as cur:
                # Fetch training vectors
                cur.execute("select count(1) from test_rag")
                count = cur.fetchone()[0]
    
                train_embeddings = []
                for start in range(0, count, query_batch_size):
                    query = f"SELECT embedding FROM test_rag ORDER BY id OFFSET {start} LIMIT {query_batch_size}"
                    cur.execute(query)
                    res = [embedding[0] for embedding in cur.fetchall()]
                    train_embeddings.extend(res)
                train = np.array(train_embeddings)
    
                # Generate query embeddings using rds_ai
                with open('query.txt', 'r', encoding='utf-8') as file:
                    queries = [query.strip() for query in file]
                test = []
                for query in queries:
                    cur.execute("SELECT rds_ai.embed(%s)::vector(1024)", (query,))
                    test.append(cur.fetchone()[0])
                test = np.array(test)
    
        # Compute ground-truth nearest neighbors using angular distance
        dot_product = np.dot(test, train.T)
        norm_test = np.linalg.norm(test, axis=1, keepdims=True)
        norm_train = np.linalg.norm(train, axis=1, keepdims=True)
        similarity = dot_product / (norm_test * norm_train.T)
        distance_matrix = 1 - similarity
    
        neighbors = np.argsort(distance_matrix, axis=1)[:, :distance_top_n]
        distances = np.take_along_axis(distance_matrix, neighbors, axis=1)
    
        # Save to HDF5
        with h5py.File('custom_dataset.hdf5', 'w') as f:
            f.create_dataset('distances', data=distances)
            f.create_dataset('neighbors', data=neighbors)
            f.create_dataset('test', data=test)
            f.create_dataset('train', data=train)
            f.attrs.update({
                "type": "dense",
                "distance": "angular",
                "dimension": embedding_len,
                "point_type": "float"
            })
    
        print("The HDF5 file is created and the dataset is added.")
    
    except (Exception, psycopg2.DatabaseError) as error:
        print(f"Error: {error}")
  2. Register the custom dataset in the DATASETS section of ~/ann-benchmarks/ann_benchmarks/datasets.py:

    DATASETS: Dict[str, Callable[[str], None]] = {
      # ... existing entries ...
      "<custom_dataset>": None,
    }
  3. Upload custom_dataset.hdf5 to the ~/ann-benchmarks directory and pass its name to run.py with --dataset <custom_dataset>.
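
Before uploading, it is worth sanity-checking that the four datasets in the HDF5 file are mutually consistent. A minimal sketch over plain sequences (the function and its checks are my own, not part of ANN-Benchmarks):

```python
def validate_ann_dataset(train, test, neighbors, distances, dim):
    """Check the shape invariants the HDF5 layout above relies on.

    train/test: sequences of vectors; neighbors/distances: one row per test vector.
    Raises AssertionError on the first violated invariant.
    """
    assert all(len(v) == dim for v in train), "train vector with wrong dimension"
    assert all(len(v) == dim for v in test), "test vector with wrong dimension"
    assert len(neighbors) == len(test) == len(distances), "row count mismatch"
    assert all(len(n) == len(d) for n, d in zip(neighbors, distances)), "ragged rows"
    assert all(0 <= i < len(train) for row in neighbors for i in row), "neighbor index out of range"
    return True
```

Running it on the in-memory arrays just before the h5py.File block catches dimension and indexing mistakes before the file is written.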

What's next