ApsaraDB RDS:Use the pgvector extension to test performance based on HNSW indexes

Last Updated: Mar 06, 2025

This topic describes how to use the pgvector extension to test the performance of an ApsaraDB RDS for PostgreSQL instance based on Hierarchical Navigable Small World (HNSW) indexes. The test uses the ANN-Benchmarks tool to evaluate key metrics, such as the recall rate, queries per second (QPS), and index creation time.

Test environment

The RDS instance and the Elastic Compute Service (ECS) instance that are used in this test must reside in the same virtual private cloud (VPC) and be attached to the same vSwitch to prevent errors caused by network fluctuations.

Test instance and test tool

RDS instance specifications

  • The RDS instance runs PostgreSQL 16.

  • The RDS instance is a standard instance that runs RDS High-availability Edition and uses the pg.x8.2xlarge.2c dedicated instance type, which provides 16 cores and 128 GB of memory.

  • Version 0.8.0 of the pgvector extension is used.

ECS instance specifications

  • The ECS instance uses the ecs.c6.xlarge instance type, which provides 4 cores and 8 GiB of memory.

  • The ECS instance runs Alibaba Cloud Linux 3.

Test tool

ANN-Benchmarks

Important

By default, the ANN-Benchmarks tool tests single-threaded performance. If you want to test the concurrency performance, you can follow the instructions provided in Use the pgvector extension to test performance based on IVF indexes.

Preparations

RDS instance

  1. Create a privileged account named ann_testuser and a database named ann_testdb. For more information, see Create a database and an account.

  2. Install the pgvector extension in the ann_testdb database. The pgvector extension is named vector in the system. For more information, see Manage extensions.
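
    If you prefer SQL to the console, the following minimal sketch installs and verifies the extension. It assumes that you are connected to the ann_testdb database as the ann_testuser account and that the account has the required privileges:

    CREATE EXTENSION IF NOT EXISTS vector;
    -- Check the installed version. Version 0.8.0 is used in this test.
    SELECT extversion FROM pg_extension WHERE extname = 'vector';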

ECS instance

  1. Install Docker. For more information, see Install Docker.

  2. Run the following commands to install Git and download the ANN-Benchmarks tool:

    yum install -y git
    cd ~
    git clone https://github.com/erikbern/ann-benchmarks.git
  3. Run the following commands to install Conda, create a virtual environment named ann_test that uses Python 3.10.6, and activate the environment:

    yum install -y conda
    conda create -n ann_test python=3.10.6
    conda init bash
    source /usr/etc/profile.d/conda.sh
    conda activate ann_test
  4. Run the following commands to install the dependencies of the ANN-Benchmarks tool:

    cd ~/ann-benchmarks/
    pip install -r requirements.txt

Test procedure

Important

All steps in the test procedure are performed in the ann_test virtual environment. If your session is disconnected, for example because of a timeout, run the conda activate ann_test command to activate the environment again.

Step 1: Configure connection information for the test tool

Append the following connection settings to the ~/ann-benchmarks/ann_benchmarks/algorithms/pgvector/module.py file of the test tool and configure the settings based on your business requirements:

# Configure the parameters to connect to the RDS instance.
os.environ['ANN_BENCHMARKS_PG_USER'] = 'ann_testuser'     # Specifies the username of the account that is used to connect to the RDS instance.
os.environ['ANN_BENCHMARKS_PG_PASSWORD'] = 'testPassword'  # Specifies the password of the account that is used to connect to the RDS instance.
os.environ['ANN_BENCHMARKS_PG_DBNAME'] = 'ann_testdb'     # Specifies the name of the required database on the RDS instance.
os.environ['ANN_BENCHMARKS_PG_HOST'] = 'pgm-****.pg.rds.aliyuncs.com'  # Specifies the internal endpoint of the RDS instance.
os.environ['ANN_BENCHMARKS_PG_PORT'] = '5432'             # Specifies the internal port of the RDS instance.
os.environ['ANN_BENCHMARKS_PG_START_SERVICE'] = 'false'   # Disables automatic startup.

Step 2: Configure test parameters for the ANN-Benchmarks tool

Modify the ~/ann-benchmarks/ann_benchmarks/algorithms/pgvector/config.yml file of the test tool based on your business requirements. Examples:

float:
  any:
  - base_args: ['@metric']
    constructor: PGVector
    disabled: false
    docker_tag: ann-benchmarks-pgvector
    module: ann_benchmarks.algorithms.pgvector
    name: pgvector
    run_groups:
      M-16(100):
        arg_groups: [{M: 16, efConstruction: 100}]
        args: {}
        query_args: [[10, 20, 40, 80, 120, 200, 400, 800]]
      M-16(200):
        arg_groups: [{M: 16, efConstruction: 200}]
        args: {}
        query_args: [[10, 20, 40, 80, 120, 200, 400, 800]]
      M-24(200):
        arg_groups: [{M: 24, efConstruction: 200}]
        args: {}
        query_args: [[10, 20, 40, 80, 120, 200, 400, 800]]                                                            

This test is performed on the following run groups: M-16(100), M-16(200), and M-24(200). For each group, arg_groups specifies the parameters that are used to create the HNSW index, and query_args specifies the ef_search values that are used at query time.

| Parameter | Sub-parameter | Description |
| --- | --- | --- |
| arg_groups | M | Corresponds to the m parameter of the HNSW index, which specifies the maximum number of neighbors of each node at each layer when the index is created. A larger value indicates a denser graph. In most cases, a denser graph increases the recall rate but extends the time required to create and query the index. |
| arg_groups | efConstruction | Corresponds to the ef_construction parameter of the HNSW index, which specifies the size of the candidate set when the index is created. The candidate set holds the candidate nodes that are retained to achieve optimal connections. In most cases, a larger value increases the recall rate but extends the time required to create and query the index. |
| query_args | ef_search | Specifies the size of the candidate set that is maintained during a query. In most cases, a larger value increases the recall rate but extends the time required to query the index. |
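
For reference, these parameters map onto the SQL interface of the pgvector extension as shown in the following minimal sketch. The items table and the embedding column are hypothetical:

-- m and ef_construction are fixed when the HNSW index is created.
CREATE INDEX ON items USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 100);
-- ef_search takes effect at query time and can be changed per session.
SET hnsw.ef_search = 40;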

Step 3: Create a test Docker image

  1. Optional. Modify the ~/ann-benchmarks/ann_benchmarks/algorithms/pgvector/Dockerfile file of the test tool based on the following information to skip the community PostgreSQL server that the tool installs and starts by default, because this test connects to the RDS instance instead:

    FROM ann-benchmarks
    USER root
    RUN pip install psycopg[binary] pgvector
  2. Run the install.py script with the --algorithm pgvector option to create a test Docker image:

    cd ~/ann-benchmarks/
    python install.py --algorithm pgvector
    Note

    You can run the python install.py --help command to view the supported configuration parameters.

Step 4: Obtain a dataset

When you run a test script, the specified public test dataset is automatically downloaded.

In this topic, the nytimes-256-angular dataset that uses the Angular distance formula is used for similarity retrieval. For more information, see ann-benchmarks.

| Dataset | Dimension | Number of data rows | Number of test vectors | Top N nearest neighbors | Distance formula |
| --- | --- | --- | --- | --- | --- |
| NYTimes | 256 | 290,000 | 10,000 | 100 | Angular |

Note

If the public dataset does not meet the test requirements, we recommend that you convert your vector data to a standard format, such as Hierarchical Data Format version 5 (HDF5), to test retrieval performance. For more information, see Appendix 2: Custom test datasets.

Step 5: Perform the test and obtain the results

  1. Run the following commands to execute the test script:

    cd ~/ann-benchmarks
    nohup python run.py --dataset nytimes-256-angular -k 10  --algorithm pgvector --runs 1 > ann_benchmark_test.log 2>&1 &
    tail -f ann_benchmark_test.log

    | Parameter | Description |
    | --- | --- |
    | --dataset | The dataset that you want to test. |
    | -k | The LIMIT value of the executed SQL statements, which specifies the number of records to return. |
    | --algorithm | The algorithm of the vector database that you want to test. In this example, the parameter is set to pgvector. |
    | --runs | The number of test runs from which the optimal result set is selected. |
    | --parallelism | The parallelism. Default value: 1. |

  2. Run the following commands to obtain the test result:

    cd ~/ann-benchmarks
    python plot.py --dataset nytimes-256-angular --recompute
    Note

    The following table describes the test result.

    | m | ef_construction | ef_search | Recall rate | QPS |
    | --- | --- | --- | --- | --- |
    | 16 | 100 | 10 | 0.630 | 1423.985 |
    | 16 | 100 | 20 | 0.741 | 1131.941 |
    | 16 | 100 | 40 | 0.820 | 836.017 |
    | 16 | 100 | 80 | 0.871 | 574.733 |
    | 16 | 100 | 120 | 0.894 | 440.076 |
    | 16 | 100 | 200 | 0.918 | 297.267 |
    | 16 | 100 | 400 | 0.945 | 162.759 |
    | 16 | 100 | 800 | 0.969 | 84.268 |
    | 16 | 200 | 10 | 0.683 | 1299.667 |
    | 16 | 200 | 20 | 0.781 | 1094.968 |
    | 16 | 200 | 40 | 0.849 | 790.838 |
    | 16 | 200 | 80 | 0.895 | 533.826 |
    | 16 | 200 | 120 | 0.914 | 405.975 |
    | 16 | 200 | 200 | 0.933 | 272.591 |
    | 16 | 200 | 400 | 0.956 | 148.688 |
    | 16 | 200 | 800 | 0.977 | 76.555 |
    | 24 | 200 | 10 | 0.767 | 1182.747 |
    | 24 | 200 | 20 | 0.840 | 922.770 |
    | 24 | 200 | 40 | 0.887 | 639.899 |
    | 24 | 200 | 80 | 0.920 | 411.140 |
    | 24 | 200 | 120 | 0.936 | 303.323 |
    | 24 | 200 | 200 | 0.953 | 199.752 |
    | 24 | 200 | 400 | 0.973 | 105.506 |
    | 24 | 200 | 800 | 0.988 | 53.904 |

  3. Optional. Run the following commands to obtain test result details, including the recall rate, QPS, response time (RT), index creation time, and index size:

    cd ~/ann-benchmarks
    python data_export.py --out res.csv

    For example, you can query the relationship between m, ef_construction, and index creation time.

    | m | ef_construction | build (index creation time, in seconds) |
    | --- | --- | --- |
    | 16 | 100 | 33.35161 |
    | 16 | 200 | 57.66014 |
    | 24 | 200 | 87.22608 |

Test conclusions

The test results support the following conclusions about HNSW indexes:

  • If you increase the value of the m, ef_construction, or ef_search parameter, the recall rate increases but the QPS decreases.

  • If you increase the value of the m or ef_construction parameter, the index creation time is also extended.

  • If you have high requirements for the recall rate, we recommend that you do not use the default values of the index parameters. The default values of the m, ef_construction, and ef_search parameters are 16, 64, and 40, respectively.
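
For example, to check the current ef_search value and create an index whose parameter values are larger than the defaults, you can execute statements similar to the following sketch, in which the items table and the embedding column are hypothetical:

SHOW hnsw.ef_search;
CREATE INDEX ON items USING hnsw (embedding vector_cosine_ops)
WITH (m = 24, ef_construction = 200);
SET hnsw.ef_search = 120;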

Appendix 1: Impacts of the parameter configurations of an RDS instance and vectors on index creation

You can configure different parameter values for the ANN-Benchmarks tool, perform the test, and then analyze the test results to determine the impacts of the parameter configurations of an RDS instance and vectors on index creation.

Impacts of the parameter configurations of an RDS instance on index creation

If you set the m parameter to 16 and the efConstruction parameter to 64 for the ANN-Benchmarks tool, the parameter configurations of the RDS instance have the following impacts on index creation:

  • maintenance_work_mem

    This parameter specifies the maximum amount of memory that can be used for maintenance operations, such as VACUUM and CREATE INDEX. Unit: KB. If the value of the maintenance_work_mem parameter is less than the size of the test data, increasing the value shortens the time required to create an index. After the value exceeds the size of the test data, further increases do not shorten the index creation time.

    For example, your RDS instance uses the pg.x8.2xlarge.2c instance type that provides 16 cores and 128 GB of memory, you set the max_parallel_maintenance_workers parameter to the default value 8, and the nytimes-256-angular dataset contains approximately 324 MB of data. In this case, the configuration of the maintenance_work_mem parameter has the following impacts on index creation.

    | maintenance_work_mem | Index creation time (seconds) |
    | --- | --- |
    | 64 MB (65,536 KB) | 52.82 |
    | 128 MB (131,072 KB) | 46.79 |
    | 256 MB (262,144 KB) | 36.40 |
    | 512 MB (524,288 KB) | 18.90 |
    | 1 GB (1,048,576 KB) | 19.06 |
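
    A minimal sketch for raising this parameter for the current session before you create the index. The value is illustrative, and on an RDS instance you can also change the parameter in the console:

    SHOW maintenance_work_mem;
    SET maintenance_work_mem = '512MB';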

  • max_parallel_maintenance_workers

    This parameter specifies the maximum number of parallel workers that can be started by a single CREATE INDEX operation. The index creation time decreases as the value of the max_parallel_maintenance_workers parameter increases.

    For example, your RDS instance uses the pg.x8.2xlarge.2c instance type that provides 16 cores and 128 GB of memory, you set the maintenance_work_mem parameter to the default value 2048 MB (2 GB), and the nytimes-256-angular dataset contains approximately 324 MB of data. In this case, the configuration of the max_parallel_maintenance_workers parameter has the following impacts on index creation.

    | max_parallel_maintenance_workers | Index creation time (seconds) |
    | --- | --- |
    | 1 | 76.00 |
    | 2 | 51.34 |
    | 4 | 32.49 |
    | 8 | 19.66 |
    | 12 | 14.44 |
    | 16 | 13.07 |
    | 24 | 13.15 |
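
    Similarly, a minimal sketch for adjusting the number of parallel workers for the current session. The value is illustrative:

    SHOW max_parallel_maintenance_workers;
    SET max_parallel_maintenance_workers = 12;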

Impacts of the parameter configurations of vectors on index creation

If you set the maintenance_work_mem parameter to 8 GB (equivalent to 8,388,608 KB) and the max_parallel_maintenance_workers parameter to 16 for the RDS instance, the characteristics of the vector data have the following impacts on index creation.

  • Vector dimension

    The GloVe dataset that contains 1,183,514 rows of data is used. If you set the m parameter to 16, the efConstruction parameter to 64, and the ef_search parameter to 40 for the ANN-Benchmarks tool, the following results are returned. The test results show that the index creation time increases, the recall rate and QPS decrease, and the query latency increases as the vector dimension increases.

    | Dimension | Index creation time (seconds) | Recall rate | QPS | P99 (ms) |
    | --- | --- | --- | --- | --- |
    | 25 | 195.10 | 0.99985 | 192.94 | 7.84 |
    | 50 | 236.92 | 0.99647 | 152.36 | 9.69 |
    | 100 | 319.36 | 0.97231 | 126.89 | 11.14 |
    | 200 | 529.33 | 0.93186 | 95.05 | 15.11 |

    Note

    P99: the 99th percentile of latency. After the response times (RTs) of all query requests are sorted in ascending order, the RTs of 99% of the requests are lower than this value.
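
    The dimension is fixed when the vector column is declared. A minimal sketch with a hypothetical table:

    CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(256));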

  • Rows of vectors

    The dbpedia-openai-{n}k-angular dataset is used, where n ranges from 100 to 1,000 and specifies the number of rows of vectors in thousands. If you set the m parameter to 48, the efConstruction parameter to 256, and the ef_search parameter to 200 for the ANN-Benchmarks tool, the following results are returned. The test results show that the index creation time increases nonlinearly, the recall rate and QPS decrease, and the query latency increases as the number of rows of vectors increases.

    | Rows of vectors (n) | Number of rows (× 10,000) | Index creation time (seconds) | Recall rate | QPS | P99 (ms) |
    | --- | --- | --- | --- | --- | --- |
    | 100 | 10 | 54.05 | 0.9993 | 171.74 | 8.93 |
    | 200 | 20 | 137.23 | 0.99901 | 146.78 | 10.81 |
    | 500 | 50 | 436.68 | 0.999 | 118.55 | 13.94 |
    | 1000 | 100 | 957.26 | 0.99879 | 101.60 | 16.35 |

Appendix 2: Custom test datasets

The following sample code generates a dataset based on the format of the nytimes-256-angular dataset and is provided only for reference.

  1. Run the following Python script to create a custom dataset:

    Note

    In this example, the rds_ai extension is used to calculate the embeddings of queries. For more information, see Use the AI capabilities provided by the rds_ai extension.

    import h5py
    import numpy as np
    import psycopg2
    import pgvector.psycopg2
    
    # Specify the connection information of the RDS instance.
    conn_info = {
        'host': 'pgm-****.rds.aliyuncs.com',
        'user': 'ann_testuser',
        'password': '****',
        'port': '5432',
        'dbname': 'ann_testdb'
    }
    
    embedding_len = 1024
    distance_top_n = 100
    query_batch_size = 100
    
    try:
        # Connect to the RDS instance.
        with psycopg2.connect(**conn_info) as connection:
            pgvector.psycopg2.register_vector(connection)
            with connection.cursor() as cur:
                # Obtain the vector data from the test_rag table, which stores the training embeddings.
                cur.execute("select count(1) from test_rag")
                count = cur.fetchone()[0]
    
                train_embeddings = []
                for start in range(0, count, query_batch_size):
                    query = f"SELECT embedding FROM test_rag ORDER BY id OFFSET {start} LIMIT {query_batch_size}"
                    cur.execute(query)
                    res = [embedding[0] for embedding in cur.fetchall()]
                    train_embeddings.extend(res)
                train = np.array(train_embeddings)
    
                # Obtain the query data and calculate the embedding.
                with open('query.txt', 'r', encoding='utf-8') as file:
                    queries = [query.strip() for query in file]
                test = []
                # Install the rds_ai extension or use Alibaba Cloud Model Studio SDK.
                for query in queries:
                    cur.execute(f"SELECT rds_ai.embed('{query.strip()}')::vector(1024)")
                    test.extend([cur.fetchone()[0]])
                test = np.array(test)
    
    # Calculate the angular (cosine) distances between the test vectors and the training vectors, and keep the top N nearest neighbors.
        dot_product = np.dot(test, train.T)
        norm_test = np.linalg.norm(test, axis=1, keepdims=True)
        norm_train = np.linalg.norm(train, axis=1, keepdims=True)
        similarity = dot_product / (norm_test * norm_train.T)
        distance_matrix = 1 - similarity
    
        neighbors = np.argsort(distance_matrix, axis=1)[:, :distance_top_n]
        distances = np.take_along_axis(distance_matrix, neighbors, axis=1)
    
        with h5py.File('custom_dataset.hdf5', 'w') as f:
            f.create_dataset('distances', data=distances)
            f.create_dataset('neighbors', data=neighbors)
            f.create_dataset('test', data=test)
            f.create_dataset('train', data=train)
            f.attrs.update({
                "type": "dense",
                "distance": "angular",
                "dimension": embedding_len,
                "point_type": "float"
            })
    
        print("The HDF5 file is created and the dataset is added.")
    
    except (Exception, psycopg2.DatabaseError) as error:
        print(f"Error: {error}")
    
  2. In the DATASETS section of the ~/ann-benchmarks/ann_benchmarks/datasets.py file, add a custom dataset.

    DATASETS: Dict[str, Callable[[str], None]] = {
      ......,
      "<custom_dataset>": None,
    }
  3. Upload the custom dataset file to the data directory of ~/ann-benchmarks as <custom_dataset>.hdf5 and use the dataset to run the run.py test script.