pgvector is a powerful extension that is supported by ApsaraDB RDS for PostgreSQL to perform vector similarity searches. The extension enables the system to support a new data type and allows you to perform vector similarity searches in an efficient manner.
Background information
ApsaraDB RDS for PostgreSQL supports the pgvector extension. The extension can be used to store vector data, perform vector similarity searches, and enrich the underlying data for AI services.
The pgvector extension provides the following benefits:
It enables the system to support vector data. You can use the pgvector extension to store and search for vector data.
It supports exact and approximate nearest neighbor (ANN) searches. This allows you to calculate the distances or similarities between vectors in a vector space by L2 Euclidean distance, cosine similarity, or inner product. It supports the construction of Hierarchical Navigable Small World (HNSW) indexes and IVFFLAT parallel indexes, element-by-element multiplication of vectors, L1 distance functions, and sum aggregation.
It allows you to create vectors with up to 16,000 dimensions and create indexes for vectors with up to 2,000 dimensions.
Terms and implementation principles
embedding
Embedding is a technique that translates high-dimensional data into a low-dimensional space. In machine learning and natural language processing (NLP), embedding is a common method that is used to represent sparse symbols or objects as continuous vectors.
In NLP, word embedding is an approach that is used to represent words in the form of real-valued vectors. This allows computers to better understand text-based content. Word embedding is used to translate the semantics and grammatical relations of words into a vector space.
Implementation principles
Embedding is a technique that abstracts data such as words, images, audio content, and videos from multiple dimensions and translates the data into vectors.
The pgvector extension supports vector data, which allows ApsaraDB RDS for PostgreSQL to store vector data.
The pgvector extension can be used to perform exact and ANN searches on stored vector data.
Assume that you need to store three objects (an apple, a banana, and a cat) in an RDS instance and use the pgvector extension to calculate the similarities among the objects. You can perform the following steps:
Use embeddings to translate the apple, banana, and cat objects into vectors. The following section provides an example of two-dimensional embeddings:
Apple: embedding [1,1] Banana: embedding [1.2,0.8] Cat: embedding [6,0.4]
Store the vector data in the RDS instance. For more information, see Examples.
The following figure shows the distribution of the apple, banana, and cat objects in a two-dimensional space.
The apple and the banana are closer because they are both fruits and are more similar to each other. The cat is an animal and is located far from the fruits.
The attributes of fruits can be grouped in terms of dimensions such as color, place of origin, and taste. If you group the fruits from finer-grained dimensions, you can obtain more accurate results.
Scenarios
You want to store vector data.
You want to perform vector similarity searches.
Prerequisites
The RDS instance meets the following requirements:
The RDS instance runs PostgreSQL 14 or later.
The minor engine version of your RDS instance is 20230430 or later. If your RDS instance runs PostgreSQL 17, the minor engine version of the instance must be 20241030 or later.
NoteFor more information about how to upgrade the major engine version or update the minor engine version of the RDS instance, see Upgrade the major engine version of an ApsaraDB RDS for PostgreSQL instance or Update the minor engine version of an ApsaraDB RDS for PostgreSQL instance.
A priviledged account is created for the RDS instance. For more information, see Create an account.
Extension management
The following operations can be performed only by the priviledged account. For more information about how to create a priviledged account, see Create an account.
Create the extension.
CREATE EXTENSION IF NOT EXISTS vector;
Delete the extension.
DROP EXTENSION vector;
Update the extension.
ALTER EXTENSION vector UPDATE [ TO new_version ]
NoteIn the preceding statement, the new_version parameter indicates the version of the pgvector extension. For more information about the latest version and related features of the pgvector extension, see official documentation of pgvector.
Examples
This section provides an example on how to use the pgvector extension. For more information, see official documentation of pgvector.
Create a table named items of the vector type to store embeddings.
CREATE TABLE items ( id bigserial PRIMARY KEY, item text, embedding vector(2) );
NoteIn the preceding statements, two-dimensional vectors are used. The pgvector extension allows you to create vectors with up to 16,000 dimensions.
Insert the vectors into the created table.
INSERT INTO items (item, embedding) VALUES ('Apple', '[1,1]'), ('Banana', '[1.2,0.8]'), ('Cat', '[6,0.4]');
Use the cosine similarity operator (
<=>
) to calculate the similarities between the banana, apple, and cat objects.SELECT item, 1 - (embedding <=> '[1.2, 0.8]') AS cosine_similarity FROM items ORDER BY cosine_similarity DESC;
NoteIn the preceding statements, the
cosine_similarity = 1 - cosine_distance
formula is used to calculate similarities. The shorter the distance between vectors, the more similar the objects are.You can also use the Euclidean distance operator (
<->
) or the inner product operator (<#>
) to calculate similarities.
Sample output:
item | cosine_similarity ------+-------------------- Banana | 1 Apple | 0.980580680748848 Cat | 0.867105556566985
In the preceding output:
A value of 1 for the banana indicates an exact match.
A value of 0.98 for the apple indicates that the apple has a high similarity with the banana.
A value of 0.86 for the cat indicates that the cat has a low similarity with the banana.
NoteYou can specify a similarity threshold based on your business requirements to exclude results with low similarities.
Create an index for vectors to improve the efficiency of vector similarity searches by using the pgvector extension.
Create an HNSW index
The following statement provides an example on how to create an HNSW index:
CREATE INDEX ON items USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);
Parameters
Parameter
Description
m
The maximum number of neighboring nodes of each node at each layer when you create the HNSW index.
A large value indicates a high graph density. In most cases, a high graph density increases the recall rate and extends time required to create and query the index.
ef_construction
The size of the candidate set when you create the HNSW index. The size of a candidate set specifies the number of candidate nodes retained to achieve optimal connections.
In most cases, a large value increases the recall rate and extends the time required to create and query the index.
Create an IVF index
The following statement provides an example on how to create an IVF index:
CREATE INDEX ON items USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
Parameters
Parameter/Value
Description
items
The name of the table for which the index is created.
embedding
The name of the column for which the index is created.
vector_cosine_ops
The access mode that is specified in the vector index method.
For vector similarity searches by cosine similarity, set this parameter to
vector_cosine_ops
.For vector similarity searches by Euclidean distance, set this parameter to
vector_l2_ops
.For vector similarity searches by inner product, set this parameter to
vector_ip_ops
.
lists = 100
The lists parameter indicates the number of lists into which the dataset is divided. A larger value indicates that the dataset is divided into more lists, the size of each subset is smaller, and the index search speed is faster. As the value of the lists parameter increases, the search recall rate may decrease.
NoteThe recall rate is the ratio of the number of exactly searched samples or classified samples to the total number of relevant samples in an information search or classification task. The recall rate is a metric that is used to measure the capability of the system to identify all relevant samples.
Index creation requires a large number of memory resources. If the value of the lists parameter is larger than 2000, the
ERROR: memory required is xxx MB, maintenance_work_mem is xxx MB
error message is displayed. In this case, you must change the value of the maintenance_work_mem parameter to a larger value. However, an excessively large value may cause out of memory (OOM) errors. For more information about how to change the value of the parameter, see Modify the parameters of an ApsaraDB RDS for PostgreSQL instance.You can change the value of the lists parameter to make a trade-off between the search speed and the recall rate. This helps meet your requirements in specific scenarios.
You can use one of the following methods to configure the ivfflat.probes parameter. The parameter specifies the number of lists that you want to search in the index. To search more lists and improve the recall rate of the search result, you can increase the value of the ivfflat.probes parameter.
Session level
SET ivfflat.probes = 10;
Transaction level
BEGIN; SET LOCAL ivfflat.probes = 10; SELECT ... COMMIT;
A large value of the ivfflat.probes parameter indicates a high recall rate of the search result but a slow search speed. To ensure the optimal search performance and recall rate, you must adjust the values of the ivfflat.probes and lists parameters based on your business requirements and the characteristics of the dataset.
NoteIf the value of the ivfflat.probes parameter is the same as the value of the lists parameter that is specified during the index creation, vector indexes are ignored and a full table scan is performed for your search. In this case, indexes are not used, and the full table is scanned. This may degrade search performance.
Performance data
When you create indexes for vectors, you must make a trade-off between the search speed and the recall rate based on your business requirements. For more information about the performance test, see the following topics: