The pgvector extension turns PolarDB for PostgreSQL into a vector database, letting you store, query, and aggregate high-dimensional embeddings alongside your relational data. With pgvector, you can:
Store vectors and run nearest-neighbor search
Choose from six distance metrics: L2, inner product, cosine, L1, Hamming, and Jaccard
Use standard SQL — JOINs, transactions, and all other PostgreSQL features work as expected
Before proceeding, verify that your . If it does not, upgrade the revision version.
Prerequisites
Before you begin, ensure that you have:
A PolarDB for PostgreSQL cluster with a revision version that supports pgvector
A privileged account for the target database
Enable the pgvector extension
Connect to the target database using the privileged account and run:
CREATE EXTENSION IF NOT EXISTS vector;The pgvector extension is scoped to the database where it is installed. To use pgvector in multiple databases within the same cluster, run this command in each database separately.
Store vectors
Create a vector table
Create a table with a vector column. The number in vector(n) defines the number of dimensions.
CREATE TABLE items_3 (id bigserial PRIMARY KEY, embedding vector(3));To add a vector column to an existing table:
ALTER TABLE items ADD COLUMN embedding vector(3);Insert vectors
Insert one or more vectors in a single statement:
INSERT INTO items_3 (embedding) VALUES ('[1,2,3]'), ('[4,5,6]');To bulk-load vectors, use the COPY API.
To insert or update on primary key conflict (upsert):
INSERT INTO items_3 (id, embedding) VALUES (1, '[1,2,3]'), (2, '[4,5,6]')
ON CONFLICT (id) DO UPDATE SET embedding = EXCLUDED.embedding;Update and delete vectors
Update a vector:
UPDATE items_3 SET embedding = '[1,2,3]' WHERE id = 1;Delete a row:
DELETE FROM items_3 WHERE id = 1;Query vectors
Distance-based search
Find the nearest neighbors to a query vector using one of the supported distance operators. Combine with ORDER BY and LIMIT to use an index.
| Operator | Metric | Notes |
|---|---|---|
<-> | Euclidean distance (L2) | |
<#> | Inner product | Returns the negative inner product, because PostgreSQL only supports ascending index scans on operators. Multiply by -1 to get the actual value. |
<=> | Cosine distance | Cosine similarity = 1 - cosine distance |
<+> | Manhattan distance (L1) | |
<~> | Hamming distance | Binary vectors only |
<%> | Jaccard distance | Binary vectors only |
Examples
Find the five nearest neighbors to a query vector:
SELECT * FROM items ORDER BY embedding <-> '[3,1,2]' LIMIT 5;Find the five nearest neighbors to a specific row:
SELECT * FROM items WHERE id != 1
ORDER BY embedding <-> (SELECT embedding FROM items WHERE id = 1)
LIMIT 5;Find rows within a specific distance:
SELECT * FROM items WHERE embedding <-> '[3,1,2]' < 5;Get the distance between each row and a query vector:
SELECT embedding <-> '[3,1,2]' AS distance FROM items;Get the inner product:
SELECT (embedding <#> '[3,1,2]') * -1 AS inner_product FROM items;Get cosine similarity:
SELECT 1 - (embedding <=> '[3,1,2]') AS cosine_similarity FROM items;Aggregate vectors
Average all vectors in a table:
SELECT AVG(embedding) FROM items;Average vectors by group:
SELECT id, AVG(embedding) FROM items GROUP BY id;What's next
pgvector indexing — create HNSW or IVFFlat indexes to accelerate nearest-neighbor queries at scale
Version management — manage and upgrade your cluster's revision version