All Products
Search
Document Center

PolarDB:Getting started

Last Updated:Jun 04, 2025

This topic describes the basic methods for using vector databases.

Note

To use the pgvector extension, make sure that the revision version of your cluster meets the requirements. If the revision version does not meet the requirements, upgrade it.

Enable the pgvector extension and create a vector table

Use the privileged account to enable the pgvector extension in the database where you want to use it.

Note

The functionality and capabilities of the pgvector extension are limited to the database in which it is installed. If you want to use the pgvector extension in multiple databases in the same PostgreSQL cluster, separately install and configure the pgvector extension for each database.

CREATE EXTENSION IF NOT EXISTS vector;

After you enable the pgvector extension, you can perform the following steps to create a vector table.

  1. Create a table with a column to hold 3-dimensional vectors.

    CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3));
  2. Insert vectors into the table.

    INSERT INTO items (embedding) VALUES ('[1,2,3]'), ('[4,5,6]');
  3. Find the nearest neighbors based on the Euclidean distance (L2 distance).

    SELECT * FROM items ORDER BY embedding <-> '[3,1,2]' LIMIT 5;

    The inner product (<#>), cosine distance (<=>), and L1 distance (<+>) metrics are also supported.

    Note

    <#> returns the negative inner product because PostgreSQL only supports ascending index scans on operators.

Store vectors

  1. Create a table with a column to hold 3-dimensional vectors.

    CREATE TABLE items_3 (id bigserial PRIMARY KEY, embedding vector(3));
    Note

    You can also add a vector column to an existing table by executing the ALTER TABLE items ADD COLUMN embedding vector(3); statement.

  2. Insert vectors.

    • Insert vectors into the table.

      INSERT INTO items_3 (embedding) VALUES ('[1,2,3]'), ('[4,5,6]');
    • Perform the COPY operation via API to load vectors in bulk.

    • Insert vectors while handling potential conflicts, such as primary key conflicts.

      INSERT INTO items_3 (id, embedding) VALUES (1, '[1,2,3]'), (2, '[4,5,6]')
          ON CONFLICT (id) DO UPDATE SET embedding = EXCLUDED.embedding;

      In this case, the embedding column of the existing row is updated with the new value if a primary key conflict occurs between the inserted and existing data. This ensures data consistency and uniqueness.

  3. Update vectors.

    UPDATE items_3 SET embedding = '[1,2,3]' WHERE id = 1;
  4. Delete vectors.

    DELETE FROM items_3 WHERE id = 1;

Query vectors

Distance

Find the nearest neighbors to a vector.

SELECT * FROM items ORDER BY embedding <-> '[3,1,2]' LIMIT 5;

The supported distance metrics include:

  • <->: Euclidean distance (L2).

  • <#>: Inner product.

  • <=>: Cosine distance.

  • <+>: Manhattan distance (L1).

  • <~>: Hamming distance (binary vectors).

  • <%>: Jaccard distance (binary vectors).

Examples

Note

Combine with ORDER BY and LIMIT to use an index.

  • Find the top five nearest neighbors to a row.

    SELECT * FROM items WHERE id != 1 ORDER BY embedding <-> (SELECT embedding FROM items WHERE id = 1) LIMIT 5;
  • Find rows within a specific distance.

    SELECT * FROM items WHERE embedding <-> '[3,1,2]' < 5;
  • Get the distance.

    SELECT embedding <-> '[3,1,2]' AS distance FROM items;
  • Get the inner product.

    Note

    <#> returns the negative of the inner product. The result is multiplied by -1 to obtain the actual inner product.

    SELECT (embedding <#> '[3,1,2]') * -1 AS inner_product FROM items;
  • Get the cosine similarity, which is derived by subtracting the cosine distance from 1.

    SELECT 1 - (embedding <=> '[3,1,2]') AS cosine_similarity FROM items;

Aggregate vectors

  • Average vectors.

    SELECT AVG(embedding) FROM items;
  • Average groups of vectors.

    SELECT id, AVG(embedding) FROM items GROUP BY id;