
ApsaraDB RDS: Use the pgvector extension to test performance based on IVF indexes

Last Updated: Mar 28, 2026

This document shows benchmark results for IVFFlat indexes on ApsaraDB RDS for PostgreSQL. Use these results to understand how data volume affects storage, and how the lists and probes parameters trade off query throughput against recall rate before you tune them for your workload.

How IVFFlat indexes work

IVFFlat divides vectors into clusters during index construction. At query time, the database searches only a subset of clusters rather than all vectors.

| Parameter | Role | Effect of a higher value |
| --- | --- | --- |
| lists | Number of clusters built during index creation | Faster queries (fewer vectors per cluster), but lower recall when query vectors land near cluster boundaries |
| probes | Number of clusters searched at query time | Higher recall, but slower queries |

The two parameters have opposite effects on recall and throughput.
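The two parameters are also set at different times: lists is fixed when the index is built, while probes is a session-level setting that can be changed at any time. A minimal sketch, assuming a hypothetical table named items with a 3-dimensional VECTOR column v (the names, dimension, and values are illustrative):

    -- lists is chosen once, at index build time
    CREATE INDEX ON items USING ivfflat (v vector_cosine_ops) WITH (lists = 100);

    -- probes is a session setting; here 10 of the 100 clusters are searched per query
    SET ivfflat.probes = 10;

    SELECT id
    FROM items
    ORDER BY v <=> '[0.1, 0.2, 0.3]'::VECTOR(3)
    LIMIT 5;

Because probes is only a session parameter, you can sweep it against a fixed index to find a recall/throughput balance, whereas changing lists requires rebuilding the index.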

Test environment

The RDS instance and ECS instance must be in the same virtual private cloud (VPC) and vSwitch to avoid network-related variance in test results.

| Component | Specification |
| --- | --- |
| RDS instance | PostgreSQL 16, RDS High-availability Edition, pg.x8.2xlarge.2c dedicated instance (16 cores, 128 GB memory) |
| pgvector version | 0.8.0 |
| ECS instance | ecs.c6.xlarge (4 cores, 8 GiB memory), Alibaba Cloud Linux 3 |
| PostgreSQL client | 15.1 |
| Test tool | pgbench |

Prerequisites

Before you begin, ensure that you have:

  - An ApsaraDB RDS for PostgreSQL instance with the vector (pgvector) extension enabled.
  - A database named testdb and a database account named testuser on the RDS instance.
  - An ECS instance in the same VPC and vSwitch as the RDS instance, with a PostgreSQL client installed.

Set up test data

  1. Connect to testdb and create a helper function that generates random vectors of a given length:

    CREATE OR REPLACE FUNCTION random_array(dim integer)
        RETURNS DOUBLE PRECISION[]
    AS $$
        SELECT array_agg(random())
        FROM generate_series(1, dim);
    $$
    LANGUAGE SQL
    VOLATILE
    COST 1;
  2. Create a table for 1536-dimensional vectors:

    CREATE TABLE vtest(id BIGINT, v VECTOR(1536));
  3. Insert 100,000 rows of test data:

    INSERT INTO vtest
    SELECT i, random_array(1536)::VECTOR(1536)
    FROM generate_series(1, 100000) AS i;
  4. Create an IVFFlat index using cosine distance with 100 lists:

    CREATE INDEX ON vtest USING ivfflat (v vector_cosine_ops) WITH (lists = 100);
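
After the index is built, you can check how much storage the table and the index consume. A sketch using PostgreSQL's size functions; the index name vtest_v_idx assumes PostgreSQL's default table_column_idx naming, so verify the actual name with \d vtest if it differs:

    -- Human-readable sizes of the test table and its IVFFlat index
    SELECT pg_size_pretty(pg_table_size('vtest'))          AS table_size,
           pg_size_pretty(pg_relation_size('vtest_v_idx')) AS index_size;

Rerunning this query after each data volume in the Test results section lets you reproduce the storage figures for your own instance.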

Run the benchmark

Use the internal endpoint of the RDS instance to eliminate network latency as a variable.

  1. Create a SQL file named test.sql with the following query. The query generates a random 1536-dimensional vector and retrieves the most similar records from vtest using cosine distance; the LIMIT clause returns a random number of rows between 0 and 49:

    WITH tmp AS (
        SELECT random_array(1536)::VECTOR(1536) AS vec
    )
    SELECT id
    FROM vtest
    ORDER BY v <=> (SELECT vec FROM tmp)
    LIMIT FLOOR(RANDOM() * 50);
  2. Run pgbench from the ECS instance. Make sure the PostgreSQL client is installed. See the pgbench documentation for reference.

    pgbench -f ./test.sql -c6 -T60 -P5 -U testuser -h pgm-bp****.pg.rds.aliyuncs.com -p 5432 -d testdb
    | Parameter | Description |
    | --- | --- |
    | -f ./test.sql | Path to the test SQL file. Replace with the actual path. |
    | -c6 | Number of concurrent client connections (6 in this test) |
    | -T60 | Test duration in seconds (60 seconds in this test) |
    | -P5 | Progress report interval in seconds (every 5 seconds) |
    | -U testuser | Database username. Replace with your username. |
    | -h pgm-bp****.pg.rds.aliyuncs.com | Internal endpoint of the RDS instance |
    | -p 5432 | Internal port of the RDS instance |
    | -d testdb | Target database |

Test results

Storage and throughput by data volume

The following results use lists = 100. Index size stays close to table size at every data volume because an IVFFlat index stores all vectors; the cluster centroids add little overhead. Storage is therefore determined primarily by data volume, not by the lists value.

| Data volume | Table size | Index size | Latency | TPS |
| --- | --- | --- | --- | --- |
| 100,000 rows | 796 MB | 782 MB | 15.7 ms | 380 |
| 300,000 rows | 2,388 MB | 2,345 MB | 63 ms | 94 |
| 500,000 rows | 3,979 MB | 3,907 MB | 74 ms | 80 |
| 800,000 rows | 6,367 MB | 6,251 MB | 90 ms | 66 |
| 1,000,000 rows | 7,958 MB | 7,813 MB | 105 ms | 56 |

Impact of probes on recall and throughput

With lists = 2000 and 1,000,000 rows: increasing probes raises the recall rate but lowers TPS.

[Figure: recall rate and TPS as probes increases (lists = 2000, 1,000,000 rows)]
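
Recall can be measured by comparing the index's answers with exact answers for the same query vector. A minimal sketch for one query vector (top 10, cosine distance); the temporary table names and the probes value are illustrative, and enable_indexscan is disabled locally to force an exact sequential scan:

    SET ivfflat.probes = 20;

    -- One random query vector, reused for both the approximate and the exact run
    CREATE TEMP TABLE q AS
    SELECT random_array(1536)::VECTOR(1536) AS vec;

    -- Approximate top 10 via the IVFFlat index
    CREATE TEMP TABLE approx AS
    SELECT id FROM vtest ORDER BY v <=> (SELECT vec FROM q) LIMIT 10;

    -- Exact top 10: disable index scans inside one transaction
    BEGIN;
    SET LOCAL enable_indexscan = off;
    CREATE TEMP TABLE exact AS
    SELECT id FROM vtest ORDER BY v <=> (SELECT vec FROM q) LIMIT 10;
    COMMIT;

    -- Recall for this query = |approximate ∩ exact| / 10
    SELECT count(*) / 10.0 AS recall
    FROM approx
    JOIN exact USING (id);

Averaging this ratio over many random query vectors gives a stable recall estimate for a given lists and probes combination.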

Impact of lists on recall and throughput

With probes = 20 and 1,000,000 rows: increasing lists lowers the recall rate but raises TPS.

[Figure: recall rate and TPS as lists increases (probes = 20, 1,000,000 rows)]

Tune lists and probes for production

The following chart shows the combined trade-off between recall and throughput as both parameters vary.

[Figure: recall rate versus TPS for combinations of lists and probes]

Use these formulas as a starting point based on the number of rows in your table:

Up to 1,000,000 rows:

lists  = row_count / 1,000
probes = lists / 10

Over 1,000,000 rows:

lists  = sqrt(row_count)
probes = sqrt(lists)
Note

sqrt is the square root function. After applying the formulas, adjust probes upward if recall is insufficient, or downward to improve throughput.
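
As a sketch, the formulas can be evaluated directly in SQL. The row count of 4,000,000 below is a placeholder; replace it with your table's actual count (for example, from SELECT count(*) FROM vtest):

    WITH t AS (
        SELECT 4000000::bigint AS row_count  -- replace with your row count
    ), l AS (
        SELECT row_count,
               CASE WHEN row_count <= 1000000
                    THEN (row_count / 1000)::int
                    ELSE ROUND(SQRT(row_count))::int
               END AS lists
        FROM t
    )
    SELECT lists,
           CASE WHEN row_count <= 1000000
                THEN GREATEST(lists / 10, 1)
                ELSE ROUND(SQRT(lists))::int
           END AS probes
    FROM l;
    -- For 4,000,000 rows this suggests lists = 2000 and probes = 45

Remember that changing lists requires rebuilding the index, while probes can be adjusted per session without a rebuild.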

Related topics