Build a Nova Vector Index in AnalyticDB for PostgreSQL V7.0 for Fast Similarity Search

Nova is the new-generation vector search engine for AnalyticDB for PostgreSQL V7.0. It comes in two modes — disk-based (Novad) and memory-optimized (Novam) — to support workloads ranging from large-scale cost-sensitive retrieval to high-performance real-time applications. This topic describes how to choose a mode, size your resources, and create a Nova index.

Prerequisites

Before you begin, ensure that you have:

An AnalyticDB for PostgreSQL V7.0 instance running minor engine version 7.4.2.0 or later. To upgrade, contact technical support.
Note
You can view the minor version on the Basic Information page of the instance in the AnalyticDB for PostgreSQL console. If the instance does not meet the version requirement, update its minor version.
Vector search engine optimization enabled

Advantages

Compared to Hierarchical Navigable Small World (HNSW) indexes, Nova indexes offer three advantages:

Faster queries: Reduced query latency for vector search.
Lower memory usage: Novad stores the bulk of the index on disk, cutting memory costs significantly.
Higher write throughput: Index building runs independently of data writes, improving ingestion performance.

Choose an index mode

Nova has two modes. Use the following table to decide which fits your workload.

	Novad	Novam
Best for	Large-scale, cost-sensitive retrieval (tens to hundreds of billions of vectors)	High-performance scenarios such as real-time recommendations
Index type	Hybrid graph + partition (HNSW in memory, IVF on disk)	Graph index (fully in memory, spills to disk when needed)
Memory dependency	Low — stable performance even when index size exceeds available memory	High — performance scales with memory allocation
Query performance	Good	Better than Novad at the same instance specs when memory is sufficient
Index build performance	Faster	Slower
Disk usage	Lower	Higher

Important

Nova runs optimization tasks in the background. These tasks consume resources even when there is no active workload.

Resource sizing

The following tables list recommended total compute resources by vector dimension and dataset size. These are starting points — add resources if your data volume exceeds the listed thresholds.

Novad

Vector dimensions	Number of vectors	Recommended total compute resources
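128	< 320 M	8 cores
256	< 160 M	8 cores
512	< 80 M	8 cores
768	< 50 M	8 cores
1024	< 40 M	8 cores
1536	< 26 M	8 cores
2048	< 20 M	8 cores
128	< 640 M	16 cores
256	< 320 M	16 cores
512	< 160 M	16 cores
768	< 100 M	16 cores
1024	< 80 M	16 cores
1536	< 60 M	16 cores
2048	< 40 M	16 cores
128	< 1.28 B	32 cores
256	< 640 M	32 cores
512	< 320 M	32 cores
768	< 200 M	32 cores
1024	< 160 M	32 cores
1536	< 120 M	32 cores
2048	< 80 M	32 cores
128	< 5.12 B	128 cores
256	< 2.56 B	128 cores
512	< 1.28 B	128 cores
768	< 800 M	128 cores
1024	< 640 M	128 cores
1536	< 480 M	128 cores
2048	< 320 M	128 cores
128	< 200 B	4096 cores
256	< 100 B	4096 cores
512	< 50 B	4096 cores
768	< 33 B	4096 cores
1024	< 25 B	4096 cores
1536	< 16 B	4096 cores
2048	< 12 B	4096 cores
128	< 1.6 T	32768 cores
256	< 800 B	32768 cores
512	< 400 B	32768 cores
768	< 260 B	32768 cores
1024	< 200 B	32768 cores
1536	< 130 B	32768 cores
2048	< 100 B	32768 cores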

Novam

Vector dimensions	Number of vectors	Recommended total compute resources
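128	< 32 M	8 cores
256	< 16 M	8 cores
512	< 8 M	8 cores
768	< 5 M	8 cores
1024	< 4 M	8 cores
1536	< 2.6 M	8 cores
2048	< 2 M	8 cores
128	< 64 M	16 cores
256	< 32 M	16 cores
512	< 16 M	16 cores
768	< 10 M	16 cores
1024	< 8 M	16 cores
1536	< 5 M	16 cores
2048	< 4 M	16 cores
128	< 128 M	32 cores
256	< 64 M	32 cores
512	< 32 M	32 cores
768	< 20 M	32 cores
1024	< 16 M	32 cores
1536	< 10 M	32 cores
2048	< 8 M	32 cores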
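
Within each core tier, the thresholds in the sizing tables shrink roughly in proportion as the dimension grows, which suggests that the product of vector count and dimension is the quantity being sized. The following sketch computes that product and the raw float32 footprint for a candidate dataset, so you can see which tier it sits closest to. It uses only standard PostgreSQL functions. The dataset shapes are hypothetical examples, and the 4-bytes-per-element figure assumes vectors are stored as REAL values; index overhead is not included.

```sql
-- Raw float32 footprint of candidate datasets (4 bytes per REAL element).
-- The (n_vectors, dim) pairs are hypothetical; replace them with your own.
SELECT
    n_vectors,
    dim,
    n_vectors * dim                     AS total_components,
    pg_size_pretty(n_vectors * dim * 4) AS raw_vector_size
FROM (VALUES
    (26000000::bigint, 1536::bigint),
    (2600000::bigint,  1536::bigint)
) AS t(n_vectors, dim);
```

Treat the result as a rough comparison aid only. The recommended core counts in the tables remain the reference.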

Create a Nova index

Syntax

CREATE INDEX [index_name]
ON [schema_name.]table_name
USING ANN(column_name)
WITH (
    DIM = <dimension>,
    ALGORITHM = <algorithm>,
    DISTANCEMEASURE = <measure>,
    ...
);

Parameters

Parameter	Description	Default	Valid values
`dim`	Vector dimensions. Required.	—	1–8192
`algorithm`	Index algorithm: `novam` (graph index, no quantization), `novad` (partitioned index with RaBitQ quantization), or `hnswflat` (HNSW without quantization).	`hnswflat`	`novam`, `novad`, `hnswflat`
`distancemeasure`	Distance metric: `L2` (squared Euclidean distance, typically used for image similarity), `IP` (inverse inner product, used as a substitute for cosine similarity after vector normalization), or `COSINE` (cosine distance, typically used for text similarity).	`L2`	`L2`, `IP`, `COSINE`
`max_delta_vecs`	Maximum number of vectors in a write batch.	1048576	1024–1073741824
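
To see why `IP` can stand in for cosine similarity once vectors are normalized, the relationship can be written out. This is standard vector algebra rather than something taken from the product documentation. It assumes that all vectors are scaled to unit length and that "inverse inner product" means the negated inner product.

```latex
\begin{aligned}
d_{\cos}(a, b) &= 1 - \frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert} \\
\text{if } \lVert a \rVert = \lVert b \rVert = 1:\quad
d_{\cos}(a, b) &= 1 - a \cdot b \\
\lVert a - b \rVert^2 &= \lVert a \rVert^2 + \lVert b \rVert^2 - 2\, a \cdot b = 2 - 2\, a \cdot b
\end{aligned}
```

On unit vectors, cosine distance, squared Euclidean distance, and the negated inner product all decrease as the inner product grows, so they produce the same nearest-neighbor ordering.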

Novam-specific parameters

Parameter	Description	Default	Valid values
`hnsw_m`	Number of neighbors per node in the graph. A higher value improves graph quality but increases build time.	16	10–1000
`hnsw_ef_construction`	Size of the candidate set used during graph construction. A higher value improves recall but increases build time.	64	40–4000
`base_slice_log2_size`	Base-2 logarithm of the file shard size. The default of 24 corresponds to a shard size of 2^24.	24	10–30

Novad-specific parameters

Parameter	Description	Default	Valid values
`nlist`	Number of partitions (inverted lists) that vectors are clustered into.	1024	2–1073741824
`accel_m`	Number of neighbors in the acceleration layer. A higher value improves query performance at the cost of more memory and build time.	16	8–1024
`accel_efc`	Size of the candidate set for building the acceleration layer. A higher value improves index quality but increases build time.	128	1–32768
`rabitq_bits`	Number of bits for RaBitQ quantization. Higher values preserve more precision but use more disk space.	1	1–8
`max_cluster_vecs`	Maximum number of vectors assigned to a single partition center. Reduce this value if individual partitions become too large.	65536	1–10000000
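
When choosing `nlist` and `rabitq_bits`, a quick back-of-envelope check can help. The sketch below is illustrative only and rests on several assumptions that the parameter table does not confirm: vectors spread evenly across partitions, `rabitq_bits` is applied per dimension, and per-vector metadata is ignored. All input values are hypothetical.

```sql
-- Rough Novad parameter checks (all inputs are hypothetical examples).
SELECT
    n_vectors / nlist                 AS avg_vecs_per_partition,
    n_vectors / nlist <= 65536        AS within_default_max_cluster_vecs,
    dim * 4                           AS raw_bytes_per_vector,          -- REAL = 4 bytes
    ceil(dim * rabitq_bits / 8.0)     AS approx_code_bytes_per_vector   -- assumes bits per dimension
FROM (VALUES (26000000::bigint, 4096, 1536, 1))
     AS t(n_vectors, nlist, dim, rabitq_bits);
```

If the average partition size comes out close to or above `max_cluster_vecs`, a larger `nlist` is one option to consider.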

Examples

Create a table

CREATE TABLE chunks (
    id SERIAL PRIMARY KEY,
    chunk VARCHAR(1024),
    intime TIMESTAMP,
    url VARCHAR(1024),
    feature REAL[]
) DISTRIBUTED BY (id);
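
Before building an index, it can help to confirm that every stored vector has the length you plan to pass as `dim`. The following is a minimal sketch using the standard PostgreSQL function `array_length`. It assumes vectors are stored as one-dimensional arrays in the `feature` column, and that the index expects the array length to equal `dim`.

```sql
-- Count rows per vector length. For an index created with dim = 1536,
-- the expectation (an assumption here) is a single row with vector_dim = 1536.
SELECT array_length(feature, 1) AS vector_dim,
       count(*)                 AS row_count
FROM chunks
GROUP BY array_length(feature, 1)
ORDER BY row_count DESC;
```

Rows whose `feature` is NULL or an empty array appear with a NULL `vector_dim`.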

Create a Novad index with cosine distance

Use Novad for large-scale, cost-sensitive workloads.

CREATE INDEX idx_feature_novad_cosine ON chunks
USING ann(feature)
WITH (
    dim = 1536,
    algorithm = novad,
    distancemeasure = cosine,
    nlist = 4096,
    rabitq_bits = 1
);

Create a Novam index with L2 distance

Use Novam for high-performance scenarios where memory is sufficient.

CREATE INDEX idx_feature_novam_l2 ON chunks
USING ann(feature)
WITH (
    dim = 1536,
    algorithm = novam,
    distancemeasure = l2,
    hnsw_m = 32,
    hnsw_ef_construction = 200
);

Verify the index

After creating the index, confirm it exists:

SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'chunks';
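
To compare the on-disk footprint of the indexes on a table, for example to check the Disk usage row of the mode comparison against your own data, the standard PostgreSQL size functions can be used. This is a sketch under one assumption: that `pg_relation_size` reflects the storage a Nova index actually uses, which is not confirmed here.

```sql
-- List every index on chunks with its reported size.
SELECT c.relname                               AS index_name,
       pg_size_pretty(pg_relation_size(c.oid)) AS index_size
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
WHERE i.indrelid = 'chunks'::regclass
ORDER BY pg_relation_size(c.oid) DESC;
```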