All Products
Search
Document Center

AnalyticDB:Use pgvector-compatible vector search

Last Updated:Mar 28, 2026

If your application already uses pgvector, migrating to AnalyticDB for PostgreSQL requires minimal code changes. AnalyticDB for PostgreSQL is compatible with pgvector SQL syntax and client libraries — the only difference is how you create vector indexes.

How it works

AnalyticDB for PostgreSQL uses its in-house FastANN vector search engine instead of the pgvector extension. When you enable the vector search engine optimization feature, FastANN is installed automatically. You do not need to install pgvector.

The key difference from standard pgvector is index creation syntax. Table structure, data insertion, search operators, vector functions, and client libraries work the same way.

Advantages

AdvantageDetails
No extension installFastANN is pre-installed. No CREATE EXTENSION vector needed.
Full SQL compatibilityReuse pgvector operators (<->, <#>, <=>), functions, and client libraries.
Distributed architectureScales beyond standalone PostgreSQL for large vector datasets.
Better performanceFastANN delivers higher query throughput and lower latency than pgvector.
Hybrid searchQuery optimizers generate combined vector + scalar filter execution plans.
Lower storage costsFLOAT2 type compression and product quantization (PQ) index compression reduce storage usage.

Prerequisites

Before you begin, ensure that you have:

Create a vector table

Use the VECTOR type instead of array types (SMALLINT[], FLOAT2[], FLOAT4[]) for vector columns. Each table supports multiple vector columns.

CREATE TABLE [TABLE_NAME]
(
    C1 DATATYPE,
    C2 DATATYPE,
    ......,
    CN VECTOR(DIM),
    PRIMARY KEY(C1 [,C2,...CN])
) DISTRIBUTED BY (C1);

VECTOR(DIM) specifies the vector column and its dimension.

Example: Create a table named FACE_TABLE with two vector columns of different dimensions.

CREATE TABLE FACE_TABLE (
    C1 INT,
    C2 VECTOR(512) NOT NULL,
    C3 VECTOR(1536) NOT NULL,
    C4 TIMESTAMP NOT NULL,
    C5 VARCHAR(20) NOT NULL,
    PRIMARY KEY (C1)
) DISTRIBUTED BY (C1);

Create a vector index

AnalyticDB for PostgreSQL uses FastANN's USING ANN syntax to create vector indexes, not pgvector's USING hnsw or USING ivfflat syntax.

CREATE INDEX [INDEX_NAME]
ON [SCHEMA_NAME].[TABLE_NAME]
USING ANN(COLUMN_NAME)
WITH (DISTANCEMEASURE=<MEASURE>,
      HNSW_M=<M>,
      HNSW_EF_CONSTRUCTION=<EF_CONSTRUCTION>,
      PQ_ENABLE=<PQ_ENABLE>,
      PQ_SEGMENTS=<PQ_SEGMENTS>,
      PQ_CENTERS=<PQ_CENTERS>,
      EXTERNAL_STORAGE=<EXTERNAL_STORAGE>);

Because the VECTOR type already encodes the dimension, the DIM parameter is not required. For the full parameter reference, see Create a vector index.

Supported distance measures and their search operators:

Distance measureDISTANCEMEASURE valueSearch operatorSorts by
Euclidean distancel2<->Ascending (smaller = closer)
Inner productip<#>Descending (more negative = higher similarity)
Cosine similaritycosine<=>Ascending cosine distance (smaller = more similar)
Each operator uses the index only when the index was created with the matching distance measure. A mismatch causes the query to fall back to an exact sequential scan.

Example: Create L2, inner product, and cosine indexes on both vector columns of FACE_TABLE.

-- Indexes on C2 (512 dimensions)
CREATE INDEX idx_c2_l2     ON FACE_TABLE USING ann(C2) WITH (distancemeasure=l2,     hnsw_m=64, pq_enable=1);
CREATE INDEX idx_c2_ip     ON FACE_TABLE USING ann(C2) WITH (distancemeasure=ip,     hnsw_m=64, pq_enable=1);
CREATE INDEX idx_c2_cosine ON FACE_TABLE USING ann(C2) WITH (distancemeasure=cosine, hnsw_m=64, pq_enable=1);

-- Indexes on C3 (1536 dimensions)
CREATE INDEX idx_c3_l2     ON FACE_TABLE USING ann(C3) WITH (distancemeasure=l2,     hnsw_m=64, pq_enable=1);
CREATE INDEX idx_c3_ip     ON FACE_TABLE USING ann(C3) WITH (distancemeasure=ip,     hnsw_m=64, pq_enable=1);
CREATE INDEX idx_c3_cosine ON FACE_TABLE USING ann(C3) WITH (distancemeasure=cosine, hnsw_m=64, pq_enable=1);

Create indexes based on the distance measures your queries actually use. Unused indexes consume storage without benefit.

Insert vector data

Use the same INSERT syntax as pgvector:

INSERT INTO FACE_TABLE
VALUES (1, '[1,2,3 ... 512]', '[1,2,3 ... 1536]', '2023-12-29 00:00:00', 'aaa.bbb.ccc/face1.jpg');

Search vectors

Choose between exact search and approximate search

Choose the search method based on your requirements:

FactorUse exact searchUse approximate index search
Dataset sizeSmallLarge
Recall requirement100% recall neededSmall recall trade-off acceptable
Query volumeLow queries per secondHigh queries per second
  • Exact search (no index): scans all vectors and returns precise results, but does not scale to large datasets or high query volumes.

  • Approximate index search: uses the ANN index to retrieve results faster, trading a small reduction in recall for significantly better performance.

Exact search uses vector functions:

-- Euclidean distance
SELECT C1 FROM FACE_TABLE ORDER BY vector_l2_squared_distance(C2, '[1,2,3 ... 512]') LIMIT 10;
-- Inner product
SELECT C1 FROM FACE_TABLE ORDER BY vector_negative_inner_product(C2, '[1,2,3 ... 512]') LIMIT 10;
-- Cosine similarity
SELECT C1 FROM FACE_TABLE ORDER BY cosine_distance(C2, '[1,2,3 ... 512]') LIMIT 10;

Approximate index search uses operators:

-- Euclidean distance (<-> uses the l2 index)
SELECT C1 FROM FACE_TABLE ORDER BY C2 <-> '[1,2,3 ... 512]' LIMIT 10;
-- Inner product (<#> uses the ip index)
SELECT C1 FROM FACE_TABLE ORDER BY C2 <#> '[1,2,3 ... 512]' LIMIT 10;
-- Cosine similarity (<=> uses the cosine index)
SELECT C1 FROM FACE_TABLE ORDER BY C2 <=> '[1,2,3 ... 512]' LIMIT 10;

For recall rate tuning parameters, see Perform vector search.

Verify that the index is used

To confirm your query uses the vector index rather than falling back to a sequential scan, run EXPLAIN ANALYZE:

EXPLAIN ANALYZE SELECT C1 FROM FACE_TABLE ORDER BY C2 <-> '[1,2,3 ... 512]' LIMIT 10;

If the output shows Seq Scan instead of an index scan, the query is not using the vector index. Check that the operator matches the index's distance measure.

Return distance scores

Each operator returns a raw score with a specific mathematical meaning. Use a subquery to sort by the index result first, then convert the score in the outer query:

OperatorRaw score meaningConversion to actual distance
<->Squared Euclidean distancesqrt(score)
<#>Negative inner product (returns negative values)-1 * score
<=>Cosine distance (0–2 range)1.0 - score gives cosine similarity
-- Euclidean distance: <-> returns the squared distance, so take the square root
SELECT t.C1, sqrt(t.score) AS score
FROM (
    SELECT C1, C2 <-> '[1,2,3 ... 512]' AS score
    FROM FACE_TABLE
    ORDER BY score LIMIT topk
) t;

-- Inner product: <#> returns the negative inner product, so multiply by -1
SELECT t.C1, (-1 * t.score) AS score
FROM (
    SELECT C1, C2 <#> '[1,2,3 ... 512]' AS score
    FROM FACE_TABLE
    ORDER BY score LIMIT topk
) t;

-- Cosine similarity: <=> returns the cosine distance, so subtract from 1.0
SELECT t.C1, (1.0 - t.score) AS score
FROM (
    SELECT C1, C2 <=> '[1,2,3 ... 512]' AS score
    FROM FACE_TABLE
    ORDER BY score LIMIT topk
) t;

Filter by score range

Apply a score threshold in the outer query after sorting by the index result:

-- Euclidean: keep results with distance < 100
SELECT t.C1, sqrt(t.score) AS score
FROM (
    SELECT C1, C2 <-> '[1,2,3 ... 512]' AS score
    FROM FACE_TABLE
    ORDER BY score LIMIT topk
) t
WHERE score < 100;

-- Inner product: keep results with similarity > 10
SELECT t.C1, (-1 * t.score) AS score
FROM (
    SELECT C1, C2 <#> '[1,2,3 ... 512]' AS score
    FROM FACE_TABLE
    ORDER BY score LIMIT topk
) t
WHERE score > 10;

-- Cosine: keep results with similarity > 0.5
SELECT t.C1, (1.0 - t.score) AS score
FROM (
    SELECT C1, C2 <=> '[1,2,3 ... 512]' AS score
    FROM FACE_TABLE
    ORDER BY score LIMIT topk
) t
WHERE score > 0.5;

Hybrid search

Combine vector search with scalar filters using the same syntax as standard AnalyticDB for PostgreSQL hybrid search. For details, see Use hybrid search.

Example: Find the 10 most similar faces captured between October 2023 and January 2024:

SELECT C1 FROM FACE_TABLE
WHERE C4 > '2023-10-01 00:00:00'
  AND C4 < '2024-01-01 00:00:00'
ORDER BY C2 <-> '[1,2,3 ... 512]'
LIMIT 10;

Migrate pgvector application code

This section shows how to adapt an existing pgvector application to work with AnalyticDB for PostgreSQL, using Dify as an example.

All changes are in api/core/rag/datasource/vdb/pgvector/pgvector.py. Three modifications are required:

1. Remove the pgvector extension install

FastANN is pre-installed. Comment out the CREATE EXTENSION call:

with self._get_cursor() as cur:
    # Comment out — AnalyticDB for PostgreSQL does not use the pgvector extension
    # cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
    cur.execute(SQL_CREATE_TABLE.format(table_name=self.table_name, dimension=dimension))

2. Replace `USING heap` with `DISTRIBUTED BY`

AnalyticDB for PostgreSQL is built on PostgreSQL 9.4 and does not support the USING heap syntax. It also requires a distribution key.

# Before
SQL_CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS {table_name} (
    id uuid PRIMARY KEY,
    text text NOT NULL,
    meta jsonb NOT NULL,
    embedding vector({dimension}) NOT NULL
) USING heap;
"""

# After
SQL_CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS {table_name} (
    id uuid PRIMARY KEY,
    text text NOT NULL,
    meta jsonb NOT NULL,
    embedding vector({dimension}) NOT NULL
) DISTRIBUTED BY (id);
"""

3. Add FastANN index creation

Add a SQL_CREATE_INDEX statement using USING ann syntax and call it after table creation:

SQL_CREATE_INDEX = """
CREATE INDEX ON {table_name} USING ann(embedding)
WITH (HNSW_M=16, HNSW_EF_CONSTRUCTION=500, PQ_ENABLE=1);
"""

def _create_collection(self, dimension: int):
    cache_key = f"vector_indexing_{self._collection_name}"
    lock_name = f"{cache_key}_lock"
    with redis_client.lock(lock_name, timeout=20):
        collection_exist_cache_key = f"vector_indexing_{self._collection_name}"
        if redis_client.get(collection_exist_cache_key):
            return

        with self._get_cursor() as cur:
            # Comment out — AnalyticDB for PostgreSQL does not use the pgvector extension
            # cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
            cur.execute(SQL_CREATE_TABLE.format(table_name=self.table_name, dimension=dimension))
            # Create a FastANN vector index
            cur.execute(SQL_CREATE_INDEX.format(table_name=self.table_name))
        redis_client.set(collection_exist_cache_key, 1, ex=3600)

References

Supported client libraries

All pgvector client libraries work with AnalyticDB for PostgreSQL. The only code change required is the index creation SQL.

LanguageClient library
Cpgvector-c
C++pgvector-cpp
Gopgvector-go
Java, Kotlin, Groovy, Scalapgvector-java
PHPpgvector-php
Pythonpgvector-python
Rustpgvector-rust

Supported vector functions

CategoryFunctionReturn typeDescription
Calculationl2_distanceDOUBLE PRECISIONEuclidean distance between two vectors
inner_productDOUBLE PRECISIONInner product distance; equivalent to cosine similarity after normalization
cosine_similarityDOUBLE PRECISIONCosine similarity (-1 to 1)
vector_dimsINTEGERNumber of dimensions
vector_normDOUBLE PRECISIONVector norm
vector_addVECTORElement-wise sum
vector_subVECTORElement-wise difference
vector_mulVECTORElement-wise product
vector_angleDOUBLE PRECISIONAngle between two vectors
Sortingvector_l2_squared_distanceDOUBLE PRECISIONSquared Euclidean distance (faster than l2_distance for sorting)
vector_negative_inner_productDOUBLE PRECISIONNegative inner product (used to sort by inner product in descending order)
cosine_distanceDOUBLE PRECISIONCosine distance (0 to 2); used to sort by cosine similarity in descending order

For distance formula details, see Create a vector index.

Examples:

SELECT l2_distance('[1,1,1,1]'::vector, '[2,2,2,2]'::vector);
SELECT inner_product('[1,1,1,1]'::vector, '[2,2,2,2]'::vector);
SELECT cosine_similarity('[1,1,1,1]'::vector, '[2,2,2,2]'::vector);
SELECT vector_dims('[1,1,1,1]'::vector);
SELECT vector_norm('[1,1,1,1]'::vector);
SELECT vector_add('[1,1,1,1]'::vector, '[2,2,2,2]'::vector);
SELECT vector_sub('[1,1,1,1]'::vector, '[2,2,2,2]'::vector);
SELECT vector_mul('[1,1,1,1]'::vector, '[2,2,2,2]'::vector);
SELECT vector_angle('[1,1,1,1]'::vector, '[2,2,2,2]'::vector);
SELECT vector_l2_squared_distance('[1,1,1,1]'::vector, '[2,2,2,2]'::vector);
SELECT vector_negative_inner_product('[1,1,1,1]'::vector, '[2,2,2,2]'::vector);
SELECT cosine_distance('[1,1,1,1]'::vector, '[2,2,2,2]'::vector);

Supported vector operators

OperatorCalculationSortingData type
<->Squared Euclidean distanceAscendingVECTOR
<#>Negative inner productDescending by dot productVECTOR
<=>Cosine distanceAscending (lower = more similar)VECTOR
+Element-wise sumVECTOR
-Element-wise differenceVECTOR
*Element-wise productVECTOR

Examples:

SELECT '[1,1,1,1]'::vector <-> '[2,2,2,2]'::vector AS score;
SELECT '[1,1,1,1]'::vector <#> '[2,2,2,2]'::vector AS score;
SELECT '[1,1,1,1]'::vector <=> '[2,2,2,2]'::vector AS score;
SELECT '[1,1,1,1]'::vector + '[2,2,2,2]'::vector AS value;
SELECT '[1,1,1,1]'::vector - '[2,2,2,2]'::vector AS value;
SELECT '[1,1,1,1]'::vector * '[2,2,2,2]'::vector AS value;

What's next