SQL references

pgvector is an open-source vector similarity search extension for PostgreSQL. It lets you store vector data alongside other data types in a database and provides the following features:

Single-precision, half-precision, binary, and sparse vectors.
Mainstream distance measures including Manhattan distance (L1), Euclidean distance (L2), inner product, cosine distance, Hamming distance, and Jaccard distance.
Exact and approximate nearest neighbor search.
Mainstream approximate vector similarity search indexes, such as Hierarchical Navigable Small World (HNSW) and Inverted File Flat (IVFFlat).
Support for any language that has a PostgreSQL client.

This extension also supports powerful PostgreSQL features, such as atomicity, consistency, isolation, and durability (ACID) compliance, point-in-time recovery, and JOINs.
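The features listed above can be combined roughly as follows. This is a minimal sketch, not taken from the original reference: it assumes that the extension is installed under the name `vector`, and the `items` table, its columns, and the `lists` value are hypothetical names and settings chosen for illustration.

```sql
-- Assumption: the extension is available under the name "vector".
CREATE EXTENSION IF NOT EXISTS vector;

-- Hypothetical table that stores a 3-dimensional vector next to a regular key.
CREATE TABLE items (
    id bigserial PRIMARY KEY,
    embedding vector(3)
);

INSERT INTO items (embedding) VALUES ('[1,2,3]'), ('[4,5,6]');

-- Exact nearest neighbor search: order rows by Euclidean distance to a query vector.
SELECT id, embedding
FROM items
ORDER BY embedding <-> '[3,1,2]'
LIMIT 5;

-- Approximate search: create an HNSW or IVFFlat index on the same column.
-- The lists value is an illustrative assumption, not a tuning recommendation.
CREATE INDEX ON items USING hnsw (embedding vector_l2_ops);
CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100);
```

Without an index, the `ORDER BY ... LIMIT` query scans every row and returns exact results. With an index, the same query can return approximate results faster.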

Copyright notice

PolarDB for PostgreSQL incorporates the pgvector extension. The pgvector extension, including its documentation and functionality, was originally developed and maintained by the PostgreSQL Global Development Group. Alibaba Cloud has made modifications to adapt pgvector for its proprietary vector database solution.

Single-precision vectors

Each single-precision vector occupies 4 * dimensions + 8 bytes of storage space. Each element of the vector is a single-precision floating-point number (similar to the real type in PostgreSQL), and all elements must be valid and finite values (cannot be NaN, Infinity, or -Infinity). Single-precision vectors can have up to 16,000 dimensions.
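As a worked example of the storage formula and the value constraints above, consider the following sketch. The 1536-dimension figure is an assumption chosen for illustration.

```sql
-- Storage estimate for one 1536-dimensional single-precision vector:
-- 4 * dimensions + 8 bytes.
SELECT 4 * 1536 + 8 AS bytes_per_vector;  -- 6152

-- A text literal cast to a typed vector. The number of elements must match
-- the declared number of dimensions.
SELECT '[1,2,3]'::vector(3);

-- Expected to be rejected, because every element must be a finite value.
-- SELECT '[1,NaN,3]'::vector(3);
```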

Operators

Operator	Description	Version requirement
`+`	Element-wise addition	None
`-`	Element-wise subtraction	None
`*`	Element-wise multiplication	0.5.0 or later
`||`	Concatenation	0.7.0 or later
`<->`	Euclidean distance	None
`<#>`	Negative inner product	None
`<=>`	Cosine distance	None
`<+>`	Manhattan distance	0.7.0 or later
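The operators in the preceding table can be tried on literal values. The following sketch uses small vectors so that the results are easy to verify by hand.

```sql
-- Element-wise arithmetic and concatenation.
SELECT '[1,2,3]'::vector + '[4,5,6]'::vector;   -- [5,7,9]
SELECT '[1,2,3]'::vector - '[4,5,6]'::vector;   -- [-3,-3,-3]
SELECT '[1,2,3]'::vector * '[4,5,6]'::vector;   -- [4,10,18]       (0.5.0 or later)
SELECT '[1,2,3]'::vector || '[4,5,6]'::vector;  -- [1,2,3,4,5,6]   (0.7.0 or later)

-- Distance operators.
SELECT '[3,4]'::vector <-> '[0,0]'::vector;      -- 5   Euclidean distance
SELECT '[1,2,3]'::vector <#> '[4,5,6]'::vector;  -- -32 negative inner product
SELECT '[1,0]'::vector <=> '[0,1]'::vector;      -- 1   cosine distance
SELECT '[1,2,3]'::vector <+> '[4,5,6]'::vector;  -- 9   Manhattan distance (0.7.0 or later)
```

Because `<#>` returns the negative inner product, multiply the result by -1 to obtain the inner product itself. Sorting by `<#>` in ascending order puts the vectors with the largest inner product first.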

Functions

Function	Description	Version requirement
`binary_quantize(vector) → bit`	Quantizes a vector into binary format.	0.7.0 or later
`cosine_distance(vector, vector) → double precision`	Calculates the cosine distance between vectors.	None
`inner_product(vector, vector) → double precision`	Calculates the inner product of vectors.	None
`l1_distance(vector, vector) → double precision`	Calculates the Manhattan distance between vectors.	0.5.0 or later
`l2_distance(vector, vector) → double precision`	Calculates the Euclidean distance between vectors.	None
`l2_normalize(vector) → vector`	Normalizes a vector by using its Euclidean norm.	0.7.0 or later
`subvector(vector, integer, integer) → vector`	Extracts a subvector from a given vector.	0.7.0 or later
`vector_dims(vector) → integer`	Returns the number of dimensions of a vector.	None
`vector_norm(vector) → double precision`	Calculates the Euclidean norm of a vector.	None

Aggregate function	Description	Version requirement
`avg(vector) → vector`	Calculates the average of vectors.	None
`sum(vector) → vector`	Calculates the sum of vectors.	0.5.0 or later
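The functions and aggregate functions above can be exercised on literals, as in the following sketch. The argument order shown for `subvector` (1-based start position, then element count) follows upstream pgvector behavior and is not stated in the tables above.

```sql
SELECT vector_dims('[1,2,3]'::vector);                       -- 3
SELECT vector_norm('[3,4]'::vector);                         -- 5
SELECT l2_normalize('[3,4]'::vector);                        -- [0.6,0.8]
SELECT inner_product('[1,2,3]'::vector, '[4,5,6]'::vector);  -- 32
SELECT l1_distance('[1,2,3]'::vector, '[4,5,6]'::vector);    -- 9

-- Three elements starting at position 2.
SELECT subvector('[1,2,3,4,5]'::vector, 2, 3);               -- [2,3,4]

-- Positive elements become 1, all other elements become 0.
SELECT binary_quantize('[1,-2,3]'::vector);                  -- 101

-- Aggregates over an inline set of rows: avg is [2,3,4], sum is [4,6,8].
SELECT avg(v), sum(v)
FROM (VALUES ('[1,2,3]'::vector), ('[3,4,5]'::vector)) AS t(v);
```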

Half-precision vectors

Each half-precision vector occupies 2 * dimensions + 8 bytes of storage space. Each element of the vector is a half-precision floating-point number, and all elements must be valid and finite values (cannot be NaN, Infinity, or -Infinity). Half-precision vectors can have up to 16,000 dimensions.
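The following sketch applies the storage formula above and shows how a half-precision value can be written. The 1536-dimension figure is an assumption chosen for illustration, and the `halfvec` type requires 0.7.0 or later.

```sql
-- Storage estimate for one 1536-dimensional half-precision vector:
-- 2 * dimensions + 8 bytes, compared with 6152 bytes in single precision.
SELECT 2 * 1536 + 8 AS bytes_per_halfvec;  -- 3080

-- A text literal cast to a typed half-precision vector.
SELECT '[1,2,3]'::halfvec(3);

-- A single-precision vector converted to half precision.
SELECT '[1,2,3]'::vector::halfvec(3);
```

Half precision roughly halves storage, at the cost of reduced numeric precision and range for each element.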

Operators

Operator	Description	Version requirement
`+`	Element-wise addition	0.7.0 or later
`-`	Element-wise subtraction	0.7.0 or later
`*`	Element-wise multiplication	0.7.0 or later
`||`	Concatenation	0.7.0 or later
`<->`	Euclidean distance	0.7.0 or later
`<#>`	Negative inner product	0.7.0 or later
`<=>`	Cosine distance	0.7.0 or later
`<+>`	Manhattan distance	0.7.0 or later

Functions

Function	Description	Version requirement
`binary_quantize(halfvec) → bit`	Quantizes a half-precision vector into binary format.	0.7.0 or later
`cosine_distance(halfvec, halfvec) → double precision`	Calculates the cosine distance between half-precision vectors.	0.7.0 or later
`inner_product(halfvec, halfvec) → double precision`	Calculates the inner product of half-precision vectors.	0.7.0 or later
`l1_distance(halfvec, halfvec) → double precision`	Calculates the Manhattan distance between half-precision vectors.	0.7.0 or later
`l2_distance(halfvec, halfvec) → double precision`	Calculates the Euclidean distance between half-precision vectors.	0.7.0 or later
`l2_norm(halfvec) → double precision`	Calculates the Euclidean norm of a half-precision vector.	0.7.0 or later
`l2_normalize(halfvec) → halfvec`	Normalizes a half-precision vector by using its Euclidean norm.	0.7.0 or later
`subvector(halfvec, integer, integer) → halfvec`	Extracts a subvector from a given half-precision vector.	0.7.0 or later
`vector_dims(halfvec) → integer`	Returns the number of dimensions of a half-precision vector.	0.7.0 or later

Aggregate function	Description	Version requirement
`avg(halfvec) → halfvec`	Calculates the average of half-precision vectors.	0.7.0 or later
`sum(halfvec) → halfvec`	Calculates the sum of half-precision vectors.	0.7.0 or later
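Half-precision vectors support the same operators as single-precision vectors, with one naming difference visible in the tables above: the Euclidean norm function is `l2_norm` rather than `vector_norm`. The following sketch uses values that are exactly representable in half precision, so the results match the single-precision equivalents.

```sql
SELECT '[1,2,3]'::halfvec + '[4,5,6]'::halfvec;    -- [5,7,9]
SELECT '[3,4]'::halfvec <-> '[0,0]'::halfvec;      -- 5
SELECT '[1,2,3]'::halfvec <#> '[4,5,6]'::halfvec;  -- -32
SELECT l2_norm('[3,4]'::halfvec);                  -- 5
SELECT vector_dims('[1,2,3]'::halfvec);            -- 3

-- Aggregates over an inline set of rows: avg is [2,3,4], sum is [4,6,8].
SELECT avg(v), sum(v)
FROM (VALUES ('[1,2,3]'::halfvec), ('[3,4,5]'::halfvec)) AS t(v);
```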

Binary vectors

Each binary vector occupies dimensions / 8 + 8 bytes of storage space. Binary vectors are stored in the PostgreSQL bit type. For more information, see Bit String Types in the PostgreSQL documentation.
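The following sketch applies the storage formula above and shows how a binary vector can be declared. The 1024-dimension figure and the `items_bit` table are assumptions chosen for illustration.

```sql
-- Storage estimate for one 1024-dimensional binary vector:
-- dimensions / 8 + 8 bytes.
SELECT 1024 / 8 + 8 AS bytes_per_bit_vector;  -- 136

-- Hypothetical table with a 3-dimensional binary vector column.
CREATE TABLE items_bit (
    id bigserial PRIMARY KEY,
    embedding bit(3)
);

INSERT INTO items_bit (embedding) VALUES ('000'), ('111');
```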

Operators

Operator	Description	Version requirement
`<~>`	Hamming distance	0.7.0 or later
`<%>`	Jaccard distance	0.7.0 or later

Functions

Function	Description	Version requirement
`hamming_distance(bit, bit) → double precision`	Calculates the Hamming distance between binary vectors.	0.7.0 or later
`jaccard_distance(bit, bit) → double precision`	Calculates the Jaccard distance between binary vectors.	0.7.0 or later
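The following sketch compares two 4-bit vectors. They differ in two positions, and they share two set bits out of four bits that are set in either vector, so the Jaccard distance is 1 - 2/4.

```sql
-- Hamming distance: number of positions in which the bits differ.
SELECT '1111'::bit(4) <~> '1100'::bit(4);                  -- 2
SELECT hamming_distance('1111'::bit(4), '1100'::bit(4));   -- 2

-- Jaccard distance: 1 - (bits set in both) / (bits set in either).
SELECT '1111'::bit(4) <%> '1100'::bit(4);                  -- 0.5
SELECT jaccard_distance('1111'::bit(4), '1100'::bit(4));   -- 0.5
```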

Sparse vectors

Each sparse vector occupies 8 * number of non-zero elements + 16 bytes of storage space. Each element of the vector is a single-precision floating-point number, and all elements must be valid and finite values (cannot be NaN, Infinity, or -Infinity). Sparse vectors can have up to 16,000 non-zero elements.
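The following sketch shows how a sparse vector can be written and how the storage formula above applies. The text format `{index:value,...}/dimensions`, with indexes that start at 1, comes from upstream pgvector and is not described in this reference.

```sql
-- A 5-dimensional sparse vector with three non-zero elements.
SELECT '{1:1,3:2,5:3}/5'::sparsevec;

-- Storage estimate: 8 * number of non-zero elements + 16 bytes.
SELECT 8 * 3 + 16 AS bytes_per_sparsevec;  -- 40

-- A dense vector converted to its sparse representation.
SELECT '[1,0,2,0,3]'::vector::sparsevec;   -- {1:1,3:2,5:3}/5
```

The 16,000 limit applies to the number of non-zero elements, so the total number of dimensions in a sparse vector can be much larger.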

Operators

Operator	Description	Version requirement
`<->`	Euclidean distance	0.7.0 or later
`<#>`	Negative inner product	0.7.0 or later
`<=>`	Cosine distance	0.7.0 or later
`<+>`	Manhattan distance	0.7.0 or later

Functions

Function	Description	Version requirement
`cosine_distance(sparsevec, sparsevec) → double precision`	Calculates the cosine distance between sparse vectors.	0.7.0 or later
`inner_product(sparsevec, sparsevec) → double precision`	Calculates the inner product of sparse vectors.	0.7.0 or later
`l1_distance(sparsevec, sparsevec) → double precision`	Calculates the Manhattan distance between sparse vectors.	0.7.0 or later
`l2_distance(sparsevec, sparsevec) → double precision`	Calculates the Euclidean distance between sparse vectors.	0.7.0 or later
`l2_norm(sparsevec) → double precision`	Calculates the Euclidean norm of a sparse vector.	0.7.0 or later
`l2_normalize(sparsevec) → sparsevec`	Normalizes a sparse vector by using its Euclidean norm.	0.7.0 or later
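The sparse vector operators and functions can be checked by hand on small inputs. In the following sketch, the two 5-dimensional vectors overlap only at indexes 1 and 5, so the inner product is 1 * 4 + 3 * 6 = 22.

```sql
-- Negative inner product and inner product.
SELECT '{1:1,3:2,5:3}/5'::sparsevec <#> '{1:4,2:5,5:6}/5'::sparsevec;            -- -22
SELECT inner_product('{1:1,3:2,5:3}/5'::sparsevec, '{1:4,2:5,5:6}/5'::sparsevec); -- 22

-- Manhattan distance: |1-4| + |0-5| + |2-0| + |3-6| = 13.
SELECT '{1:1,3:2,5:3}/5'::sparsevec <+> '{1:4,2:5,5:6}/5'::sparsevec;            -- 13

-- Euclidean norm and normalization.
SELECT l2_norm('{1:3,2:4}/5'::sparsevec);                                        -- 5
SELECT l2_normalize('{1:3,2:4}/5'::sparsevec);                                   -- {1:0.6,2:0.8}/5
```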