Build Hybrid Search with Vector, Full-Text & RRF Reranking - PolarDB

Hybrid search combines vector search with full-text search to return results that are both semantically relevant and keyword-accurate.

Vector search finds documents by meaning and handles paraphrasing and synonyms well, but it can miss exact terms such as product names, error codes, or version numbers. Full-text search matches keywords precisely but has no understanding of intent or context. Hybrid search runs both in parallel and merges the ranked lists, capturing what each method alone would miss.

When to use hybrid search

Use hybrid search when queries mix natural language with specific terms, for example "how to enable RBAC in version 1.30" or "Python SDK connection timeout error." Vector search handles the natural-language part of such a query and full-text search handles the exact terms, but only hybrid search handles both in the same query.

Scenario	Recommended approach
Purely keyword-based queries (exact error codes, product SKUs, IDs)	Full-text search alone
Purely conversational or semantic queries	Vector search alone
Queries that mix natural language with specific terms	Hybrid search

Run a hybrid search query

The following example shows the full-text half of a hybrid search. It queries a table named items, keeps the rows whose textsearch column matches the query, and ranks them by full-text relevance:

SELECT id, content FROM items, plainto_tsquery('hello search') query
    WHERE textsearch @@ query ORDER BY ts_rank_cd(textsearch, query) DESC LIMIT 5;

This query returns only the rows that satisfy the full-text match condition. To incorporate vector similarity results, run a separate vector search and then merge the two result sets using one of the approaches below.

Merge results

After running a vector search and a full-text search independently, merge their ranked lists into a single result. Two approaches are available: Reciprocal Rank Fusion (RRF) and cross-encoder reranking.

Reciprocal Rank Fusion (RRF)

RRF combines results based on each item's rank position in each list, not on the raw scores, so full-text and vector scores on different scales need no normalization. For each item, the score is the sum of 1 / (k + rank) over the lists in which it appears, where k is a smoothing constant (typically 60) that prevents top-ranked items from dominating. Items that rank well in both lists accumulate higher scores and rise to the top.

For example, if a document ranks third in the full-text list and ninth in the vector list, its RRF score is 1 / (60 + 3) + 1 / (60 + 9) ≈ 0.030.
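The scoring rule above can be sketched in a few lines of Python. This is a minimal illustration, not the referenced pgvector-python example: the function name rrf_merge, the sample IDs, and the tie-breaking rule are assumptions made for this sketch, and the input lists stand in for the IDs returned by the full-text and vector queries.

```python
from collections import defaultdict


def rrf_merge(ranked_lists, k=60, limit=None):
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion.

    ranked_lists: iterable of lists, each ordered best-first.
    k: smoothing constant (60 is the commonly used default).
    Returns a list of (doc_id, score) pairs, best-first.
    """
    scores = defaultdict(float)
    for ranked in ranked_lists:
        # Ranks are 1-based: the top item in a list has rank 1.
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score, highest first; break ties by ID so output is deterministic.
    merged = sorted(scores.items(), key=lambda item: (-item[1], str(item[0])))
    return merged if limit is None else merged[:limit]


# Hypothetical IDs, as they might come back from the two searches.
full_text_ids = [7, 2, 5, 9]
vector_ids = [5, 7, 11, 2]

fused = rrf_merge([full_text_ids, vector_ids])
for doc_id, score in fused:
    print(doc_id, round(score, 4))
```

Documents 7, 5, and 2 appear in both lists, so they outrank documents 9 and 11, which each appear in only one. Because only rank positions are used, the two searches can return scores on entirely different scales.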

RRF requires no additional model and works well for most production use cases. Start with RRF when you need to merge results at query time without adding inference latency.

A Python example is available at pgvector-python/examples/hybrid_search/rrf.py.

Cross-encoder reranking

A cross-encoder is a neural model that scores each candidate document by jointly encoding the query and the document text. Unlike RRF, which relies only on rank positions, a cross-encoder reads the actual content and assigns a relevance score based on meaning.

Cross-encoder reranking produces higher-quality rankings but adds inference latency. Run it on a small candidate set (for example, the top 20–50 results from each search) rather than the full table.
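The reranking step can be sketched as follows: pool the top candidates from both searches, score each (query, document) pair, and sort by that score. To keep the sketch runnable without downloading a model, the scorer here is a simple token-overlap placeholder and is not a cross-encoder. The names rerank and overlap_scorer and the sample rows are assumptions for illustration. In practice you would pass a real model's scoring function instead; for example, the sentence-transformers CrossEncoder class has a predict method that accepts a list of such pairs.

```python
def rerank(query, candidates, score_pairs, top_k=5):
    """Rerank candidates with a pairwise scorer.

    candidates: dict mapping document ID to document text.
    score_pairs: callable taking a list of (query, text) pairs and
                 returning one relevance score per pair.
    Returns the top_k (doc_id, score) pairs, best-first.
    """
    ids = list(candidates)
    pairs = [(query, candidates[doc_id]) for doc_id in ids]
    scores = score_pairs(pairs)
    ranked = sorted(zip(ids, scores), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


def overlap_scorer(pairs):
    """Placeholder scorer (NOT a cross-encoder): share of query tokens in the text."""
    scores = []
    for query, text in pairs:
        query_tokens = set(query.lower().split())
        text_tokens = set(text.lower().split())
        scores.append(len(query_tokens & text_tokens) / max(len(query_tokens), 1))
    return scores


# Hypothetical top results from each search, keyed by ID.
full_text_hits = {1: "enable RBAC in version 1.30", 2: "RBAC overview"}
vector_hits = {2: "RBAC overview", 3: "configure role-based access control"}

# Pool both candidate sets; duplicate IDs collapse into one entry.
candidates = {**full_text_hits, **vector_hits}

results = rerank("how to enable RBAC", candidates, overlap_scorer, top_k=2)
print(results)
```

Keeping the scorer pluggable means the pooling and sorting logic stays the same whichever model you choose, and the candidate pool size remains the main control on inference latency.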

A Python example is available at pgvector-python/examples/hybrid_search/cross_encoder.py.

Choose an approach

Criterion	RRF	Cross-encoder
Input	Rank positions from each list	Query + document text
Inference required	No	Yes (neural model)
Latency	Low	Higher
Best for	General-purpose merging at scale	High-precision reranking on a small candidate set

Start with RRF. Switch to a cross-encoder if ranking quality is critical and your candidate set is small enough to keep latency acceptable.