Use the kNN search feature of Elasticsearch

Last Updated: Mar 26, 2026

Alibaba Cloud Elasticsearch V8.0 and later support k-nearest neighbor (kNN) search, which finds the vectors most similar to a query vector. kNN search powers use cases such as:

  • Image search and video fingerprinting: Find visually similar images or near-duplicate video segments

  • Facial and speech recognition: Match biometric vectors against stored profiles

  • Product recommendation: Retrieve items whose feature vectors are closest to a user's preference vector

How it works

Store your data as dense_vector fields in an Elasticsearch index. When you run a kNN search, Elasticsearch computes the similarity between your query vector and each document vector, then returns the k closest matches.
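
The idea of "compute the similarity to every document vector, then keep the k closest" can be sketched in a few lines of Python. This is only an illustration of the concept, not how Elasticsearch implements it; the function name `knn_exact`, the sample vectors, and the choice of Euclidean distance are assumptions made for the example.

```python
import heapq
import math


def knn_exact(query, docs, k):
    """Return the ids of the k vectors closest to `query` (Euclidean distance).

    `docs` maps a document id to its vector. Every vector is compared with
    the query, which is what makes this a brute-force (exact) search.
    """
    def distance(item):
        _doc_id, vector = item
        return math.dist(query, vector)

    closest = heapq.nsmallest(k, docs.items(), key=distance)
    return [doc_id for doc_id, _vector in closest]


# Illustrative sample data: three 3-dimensional vectors.
docs = {"1": [1, 5, -20], "2": [42, 8, -15], "3": [15, 11, 23]}
print(knn_exact([-5, 9, -12], docs, k=2))
```

Because every stored vector is visited, the cost grows linearly with the number of documents, which is why an approximate index is usually preferred for large datasets.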

Two search methods are available:

Search method | Query interface | In-memory storage | Mapping requirement
Approximate kNN search | Search API with the knn parameter | Yes | Set index: true on dense_vector fields
Exact kNN search | script_score query with a vector function | Yes | Set index: false or omit the index parameter

In most cases, use approximate kNN search. It returns results with lower latency at the cost of slightly reduced accuracy. Exact kNN search guarantees precise results but performs a brute-force scan of all matched documents and is not suitable for large-scale datasets.

Prerequisites

Before you begin, ensure that you have:

  • An Alibaba Cloud Elasticsearch V8.x cluster. This guide uses V8.5.1 as an example. For setup instructions, see Create an Alibaba Cloud Elasticsearch cluster.

  • Business data converted into vector representations and stored in dense_vector fields. Design your vectors so that documents more similar to a query have vectors closer to the query vector, based on your chosen similarity metric.

Usage notes

  • dense_vector fields do not support aggregation or sorting.

  • dense_vector fields inside a nested mapping do not support approximate kNN search.

  • In cross-cluster search scenarios, the ccs_minimize_roundtrips parameter is not supported with kNN search.

  • kNN search always uses the dfs_query_then_fetch search type. The search_type parameter cannot be set explicitly when running a kNN search.

Approximate kNN search

Performance considerations

Approximate kNN search uses Hierarchical Navigable Small World (HNSW) graphs to index dense vectors per segment. Building HNSW graphs during indexing is resource-intensive. To maintain stable performance:

  • Increase the client timeout period and use bulk write requests when indexing vector data.

  • Reduce the number of segments, or merge all segments into one, to improve search speed.

  • Make sure that data node memory exceeds the combined size of all vector data and the HNSW index structures.

  • Avoid large write or update operations while an approximate kNN search is running.
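
For the memory guideline above, a rough lower bound can be worked out by hand. The sketch below assumes each vector component is stored as a 4-byte float; the `graph_bytes_per_vector` argument is a hypothetical placeholder for HNSW overhead, which depends on the Elasticsearch version and index settings and is not specified in this guide.

```python
def estimate_vector_memory_bytes(num_vectors, dims,
                                 bytes_per_value=4,
                                 graph_bytes_per_vector=0):
    """Rough memory estimate for vector data, in bytes.

    raw vectors : num_vectors * dims * bytes_per_value
    HNSW graph  : num_vectors * graph_bytes_per_vector (caller-supplied guess)
    """
    raw = num_vectors * dims * bytes_per_value
    graph = num_vectors * graph_bytes_per_vector
    return raw + graph


# Example: one million 768-dimensional float vectors, raw data only.
size = estimate_vector_memory_bytes(1_000_000, 768)
print(f"{size / 1024 ** 3:.2f} GiB")
```

Treat the result as a starting point for capacity planning rather than a precise figure, and leave headroom for the rest of the node's workload.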

Create an index

Set index: true and configure the similarity parameter when defining dense_vector fields.

PUT image-index
{
  "mappings": {
    "properties": {
      "image-vector": {
        "type": "dense_vector",
        "dims": 3,
        "index": true,
        "similarity": "l2_norm"
      },
      "title": {
        "type": "text"
      },
      "file-type": {
        "type": "keyword"
      }
    }
  }
}

`dense_vector` field parameters

For the full parameter reference, see dense-vector.

Parameter | Description
type | Set to dense_vector.
dims | Number of dimensions per vector. Maximum is 1024 when index: true, and 2048 when index: false.
index | Set to true to build an HNSW index for approximate kNN search. Default: false.
similarity | Similarity algorithm used to score matches. Required when index: true. See the table below.

Supported similarity algorithms

Value | Algorithm | Score formula
l2_norm | Euclidean distance | 1 / (1 + l2_norm(query, vector)^2)
dot_product (float) | Dot product; both vectors must be normalized to unit length | (1 + dot_product(query, vector)) / 2
dot_product (byte) | Dot product; both vectors must have the same length | 0.5 + (dot_product(query, vector) / (32768 × dims))
cosine | Cosine similarity | (1 + cosine(query, vector)) / 2

Important

The cosine algorithm does not support zero-magnitude vectors (vectors whose components are all 0), because cosine similarity is undefined for them.
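
The score formulas in the similarity table can be reproduced by hand. The following Python sketch transcribes them directly; the function names are made up for this example and are not part of any Elasticsearch client.

```python
import math


def _dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def l2_norm_score(query, vector):
    # 1 / (1 + l2_norm(query, vector)^2)
    squared_distance = sum((q - v) ** 2 for q, v in zip(query, vector))
    return 1.0 / (1.0 + squared_distance)


def dot_product_score_float(query, vector):
    # (1 + dot_product(query, vector)) / 2; assumes unit-length inputs
    return (1.0 + _dot(query, vector)) / 2.0


def dot_product_score_byte(query, vector):
    # 0.5 + dot_product(query, vector) / (32768 * dims)
    return 0.5 + _dot(query, vector) / (32768 * len(vector))


def cosine_score(query, vector):
    # (1 + cosine(query, vector)) / 2; undefined for zero-magnitude vectors
    norms = math.sqrt(_dot(query, query)) * math.sqrt(_dot(vector, vector))
    if norms == 0:
        raise ValueError("cosine is undefined for zero-magnitude vectors")
    return (1.0 + _dot(query, vector) / norms) / 2.0


print(l2_norm_score([0, 0], [3, 4]))   # distance 5, so 1 / (1 + 25)
print(cosine_score([1, 0], [0, 1]))    # orthogonal vectors
```

In each formula a higher score means a closer match, and identical vectors score 1.0 under l2_norm, cosine, and unit-length dot_product.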

Note

Approximate kNN search is supported in Elasticsearch V8.0 and later. If your cluster was upgraded from a version earlier than V8.0, recreate the index and set index: true on any dense_vector fields before running approximate kNN searches.

Write data

POST image-index/_bulk?refresh=true
{ "index": { "_id": "1" } }
{ "image-vector": [1, 5, -20], "title": "moose family", "file-type": "jpg" }
{ "index": { "_id": "2" } }
{ "image-vector": [42, 8, -15], "title": "alpine lake", "file-type": "png" }
{ "index": { "_id": "3" } }
{ "image-vector": [15, 11, 23], "title": "full moon", "file-type": "jpg" }
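
When the documents come from application code rather than a console, the bulk body has to be assembled as newline-delimited JSON. The sketch below shows one way to do that with the Python standard library only; `build_bulk_body` is a hypothetical helper, and sending the body to the cluster (for example as a POST to image-index/_bulk with an NDJSON content type) is left to whichever HTTP or Elasticsearch client you use.

```python
import json


def build_bulk_body(docs):
    """Build an NDJSON bulk body from (doc_id, source) pairs.

    Each document produces two lines: an action line and a source line.
    The body must end with a newline character.
    """
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_id": doc_id}}))
        lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"


docs = [
    ("1", {"image-vector": [1, 5, -20], "title": "moose family", "file-type": "jpg"}),
    ("2", {"image-vector": [42, 8, -15], "title": "alpine lake", "file-type": "png"}),
    ("3", {"image-vector": [15, 11, 23], "title": "full moon", "file-type": "jpg"}),
]
print(build_bulk_body(docs))
```

For large datasets, splitting `docs` into batches and sending one bulk request per batch keeps individual requests small while still avoiding per-document round trips.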

Run an approximate kNN search

Use the knn option of the Search API. The standalone kNN search API (_knn_search) is deprecated as of Elasticsearch V8.4, so use the Search API instead.

POST image-index/_search
{
  "knn": {
    "field": "image-vector",
    "query_vector": [-5, 9, -12],
    "k": 10,
    "num_candidates": 100
  },
  "fields": [ "title", "file-type" ]
}

kNN search parameters

For the full parameter reference, see Search API.

Parameter | Required | Description
field | Yes | Name of the dense_vector field to search.
query_vector | Yes | Query vector. Must have the same number of dimensions as the target field.
k | Yes | Number of nearest neighbors to return. Must be less than num_candidates.
num_candidates | Yes | Number of nearest neighbor candidates evaluated per shard. Maximum: 10,000. A larger value improves accuracy but increases latency.
filter | No | Query DSL filter applied during the search. Returns the top k documents that match the filter. If omitted, all documents are considered.
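
The constraints in the parameter table are easy to get wrong when requests are generated programmatically. The sketch below builds a search body as a Python dictionary and checks those constraints first; `build_knn_request` and its `dims` argument are hypothetical names introduced for this example, and the term filter on file-type is only one possible filter.

```python
def build_knn_request(field, query_vector, k, num_candidates,
                      filter_query=None, dims=None):
    """Return a Search API body with a knn section, after basic validation."""
    if dims is not None and len(query_vector) != dims:
        raise ValueError("query_vector must match the field's dims")
    if k >= num_candidates:
        raise ValueError("k must be less than num_candidates")
    if num_candidates > 10_000:
        raise ValueError("num_candidates cannot exceed 10,000")

    knn = {
        "field": field,
        "query_vector": query_vector,
        "k": k,
        "num_candidates": num_candidates,
    }
    if filter_query is not None:
        # Only documents matching the filter are eligible as neighbors.
        knn["filter"] = filter_query
    return {"knn": knn, "fields": ["title", "file-type"]}


body = build_knn_request(
    "image-vector", [-5, 9, -12], k=10, num_candidates=100,
    filter_query={"term": {"file-type": "jpg"}}, dims=3,
)
print(body)
```

The resulting dictionary can be serialized to JSON and sent as the body of POST image-index/_search.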

Exact kNN search

Create an index

Set index: false (or omit the index parameter, because false is the default) to skip building an HNSW index.

PUT zl-index
{
  "mappings": {
    "properties": {
      "product-vector": {
        "type": "dense_vector",
        "dims": 5,
        "index": false
      },
      "price": {
        "type": "long"
      }
    }
  }
}

Parameter | Description
type | Set to dense_vector.
dims | Number of dimensions per vector.
index | Default: false. Set to false or omit the parameter so that no HNSW index is built; exact kNN search does not use one.

Write data

POST zl-index/_bulk?refresh=true
{ "index": { "_id": "1" } }
{ "product-vector": [230.0, 300.33, -34.8988, 15.555, -200.0], "price": 1599 }
{ "index": { "_id": "2" } }
{ "product-vector": [-0.5, 100.0, -13.0, 14.8, -156.0], "price": 799 }
{ "index": { "_id": "3" } }
{ "product-vector": [0.5, 111.3, -13.0, 14.8, -156.0], "price": 1099 }

Run an exact kNN search

Use the script_score query with a vector function. The example below uses cosineSimilarity and applies a filter to limit the documents evaluated by the vector function. Reducing the number of documents scanned improves performance. The script adds 1.0 to the cosine similarity because script_score does not allow negative scores.

POST zl-index/_search
{
  "query": {
    "script_score": {
      "query": {
        "bool": {
          "filter": {
            "range": {
              "price": {
                "gte": 1000
              }
            }
          }
        }
      },
      "script": {
        "source": "cosineSimilarity(params.queryVector, 'product-vector') + 1.0",
        "params": {
          "queryVector": [-0.5, 90.0, -10, 14.8, -156.0]
        }
      }
    }
  }
}
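
To see what the query above computes, the same steps can be reproduced locally: filter by price, score each remaining document with cosine similarity plus 1.0, and sort by score. This Python sketch mirrors the sample data from this section; `exact_knn` and `cosine_similarity` are names chosen for the example, not Elasticsearch APIs.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exact_knn(query_vector, docs, min_price, k):
    """Filter by price, score every remaining document, return the top k."""
    hits = []
    for doc in docs:
        if doc["price"] >= min_price:  # same role as the range filter
            score = cosine_similarity(query_vector, doc["product-vector"]) + 1.0
            hits.append((doc["_id"], score))
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits[:k]


docs = [
    {"_id": "1", "product-vector": [230.0, 300.33, -34.8988, 15.555, -200.0], "price": 1599},
    {"_id": "2", "product-vector": [-0.5, 100.0, -13.0, 14.8, -156.0], "price": 799},
    {"_id": "3", "product-vector": [0.5, 111.3, -13.0, 14.8, -156.0], "price": 1099},
]
for doc_id, score in exact_knn([-0.5, 90.0, -10, 14.8, -156.0], docs, 1000, k=10):
    print(doc_id, round(score, 4))
```

Document 2 is excluded by the price filter before any vector math runs, which is the reason a restrictive filter reduces the cost of an exact search.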

Supported vector functions

For the full function reference, see Functions for vector fields.

Function | Description
cosineSimilarity | Cosine similarity between query vector and document vector
dotProduct | Dot product of query vector and document vector
l1norm | L1 distance (Manhattan distance) between query vector and document vector
l2norm | L2 distance (Euclidean distance) between query vector and document vector
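
The two distance functions return smaller values for closer vectors, while script_score ranks higher scores first, so a script usually inverts the distance, for example with 1 / (1 + distance). The Python sketch below shows the arithmetic; the helper `distance_to_score` is an assumption made for illustration, not a built-in function.

```python
import math


def l1norm(query, vector):
    """Manhattan distance: sum of absolute component differences."""
    return sum(abs(q - v) for q, v in zip(query, vector))


def l2norm(query, vector):
    """Euclidean distance: square root of the summed squared differences."""
    return math.sqrt(sum((q - v) ** 2 for q, v in zip(query, vector)))


def distance_to_score(distance):
    """Map a distance in [0, inf) to a score in (0, 1]; closer is higher."""
    return 1.0 / (1.0 + distance)


print(l1norm([0, 0], [3, 4]), l2norm([0, 0], [3, 4]))
print(distance_to_score(l2norm([0, 0], [3, 4])))
```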
