Elasticsearch: Alibaba Cloud Elasticsearch Vector Engine User Guide

Last Updated: Mar 26, 2026

The Alibaba Cloud Elasticsearch vector engine lets you run large-scale vector similarity search alongside full-text search in a single cluster. Use it to build recommendation systems, image retrieval pipelines, and natural language processing applications.

The vector engine stores HNSW indexes in off-heap memory. Before you select a data node specification and node count, estimate your off-heap memory usage with the memory calculation guidance later in this topic.
The vector engine is updated regularly. Use the latest version of Alibaba Cloud Elasticsearch 8.x for the best performance and cost efficiency.

Prerequisites

Before you begin, estimate the off-heap memory that your vector indexes require. The vector engine uses large amounts of off-heap memory to cache vector indexes, so this estimate determines the data node specification and node count that you need. Use the memory calculation guidance later in this topic.

Choose a search method

Before creating an index, decide which kNN search method fits your use case:

| Method | How it works | Best for |
| --- | --- | --- |
| Approximate kNN | Uses the knn clause with HNSW indexing for fast, scalable search | Most production workloads |
| Exact kNN | Uses a script_score query for brute-force scoring of every document | Small datasets or precise scoring |

Approximate kNN offers low latency and good accuracy. Exact kNN guarantees accurate results but does not scale to large datasets.

Step 1: Create an index

Create an index with a dense_vector field to store vector data:

PUT /my_vector_index
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "dense_vector",
        "dims": 3
      },
      "my_text": {
        "type": "keyword"
      }
    }
  }
}

Key points:

  • Set dims to the output dimension of your embedding model. All documents in the index must use the same dimension.

  • Decide number_of_shards and number_of_replicas based on your data volume and performance requirements.

  • dense_vector supports additional parameters such as similarity metrics and index options. See Dense vector field type for the full list.
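
The key points above can be sketched as a small helper that builds the index body. This is a minimal sketch, not the only way to create the index: `build_index_body` and its defaults are hypothetical names introduced for illustration, and the list of similarity metrics is an assumption based on the Elasticsearch 8.x dense_vector reference, so check it against your cluster version.

```python
import json

# Similarity metrics documented for dense_vector in Elasticsearch 8.x
# (assumption: verify against the version you run).
ALLOWED_SIMILARITIES = {"l2_norm", "dot_product", "cosine", "max_inner_product"}


def build_index_body(dims, similarity="cosine", shards=1, replicas=1):
    """Build a create-index body with one dense_vector field and one keyword field."""
    if dims < 1:
        raise ValueError("dims must be a positive integer")
    if similarity not in ALLOWED_SIMILARITIES:
        raise ValueError(f"unsupported similarity: {similarity}")
    return {
        "settings": {"number_of_shards": shards, "number_of_replicas": replicas},
        "mappings": {
            "properties": {
                "my_vector": {
                    "type": "dense_vector",
                    "dims": dims,          # must equal the embedding model's output size
                    "index": True,         # build an HNSW index for approximate kNN
                    "similarity": similarity,
                },
                "my_text": {"type": "keyword"},
            }
        },
    }


# Send the printed JSON as the body of PUT /my_vector_index.
print(json.dumps(build_index_body(dims=3), indent=2))
```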

Step 2: Index documents

Add documents using the Bulk API or individual index requests. Each document's vector must match the dims value in the mapping:

PUT my_vector_index/_doc/1
{
  "my_text": "text1",
  "my_vector": [0.5, 10, 6]
}

PUT my_vector_index/_doc/2
{
  "my_text": "text2",
  "my_vector": [-0.5, 10, 10]
}

Important: If a document's vector dimension does not match dims in the mapping, Elasticsearch rejects the document with an error.
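
For larger batches, the two requests above can be combined into one Bulk API payload. The following sketch builds the newline-delimited JSON body and checks each vector's length on the client before sending, so dimension mismatches surface early. `build_bulk_body` is a hypothetical helper, and sending the payload (for example with `POST my_vector_index/_bulk`) is left to whatever HTTP client you use.

```python
import json


def build_bulk_body(docs, dims, vector_field="my_vector"):
    """Return an NDJSON string for the Bulk API; raise if a vector has the wrong length."""
    lines = []
    for doc_id, source in docs:
        vector = source[vector_field]
        if len(vector) != dims:
            raise ValueError(
                f"document {doc_id}: expected {dims} dimensions, got {len(vector)}"
            )
        # Each document takes two lines: the action line, then the source.
        lines.append(json.dumps({"index": {"_id": doc_id}}))
        lines.append(json.dumps(source))
    # The Bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


docs = [
    ("1", {"my_text": "text1", "my_vector": [0.5, 10, 6]}),
    ("2", {"my_text": "text2", "my_vector": [-0.5, 10, 10]}),
]
bulk_body = build_bulk_body(docs, dims=3)
print(bulk_body, end="")
```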

Step 3: Run a kNN search

Approximate kNN (recommended)

Submit a kNN query by specifying a query vector and the number of results to return:

GET my_vector_index/_search
{
  "knn": {
    "field": "my_vector",
    "query_vector": [-5, 9, -12],
    "k": 10,
    "num_candidates": 100
  },
  "fields": ["my_text"]
}

| Parameter | Description |
| --- | --- |
| k | Number of nearest neighbors to return. Must be less than or equal to num_candidates. Defaults to the size value. |
| num_candidates | Number of nearest neighbor candidates to collect per shard before merging results. Must be greater than or equal to k and less than or equal to 10,000. Defaults to Math.min(1.5 * k, 10000). A higher value improves recall at the cost of latency. |

In HNSW terms, num_candidates plays the role of the ef (exploration factor) value at query time: it controls how many candidate documents each shard explores before returning its top results. k is the final number of documents returned across all shards.
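
The parameter rules above can be checked on the client before a request is sent. The sketch below applies the documented default and bounds. `build_knn_clause` is a hypothetical helper, and rounding the default down to an integer is an assumption of this sketch rather than documented server behavior.

```python
MAX_NUM_CANDIDATES = 10_000


def build_knn_clause(field, query_vector, k, num_candidates=None):
    """Build a knn clause, applying the documented default and bounds."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if num_candidates is None:
        # Documented default: Math.min(1.5 * k, 10000).
        # Assumption: the fractional part is dropped.
        num_candidates = min(int(1.5 * k), MAX_NUM_CANDIDATES)
    if not k <= num_candidates <= MAX_NUM_CANDIDATES:
        raise ValueError(
            f"num_candidates must be between k ({k}) and {MAX_NUM_CANDIDATES}"
        )
    return {
        "field": field,
        "query_vector": query_vector,
        "k": k,
        "num_candidates": num_candidates,
    }


default_clause = build_knn_clause("my_vector", [-5, 9, -12], k=10)
explicit_clause = build_knn_clause("my_vector", [-5, 9, -12], k=10, num_candidates=100)
print(default_clause["num_candidates"], explicit_clause["num_candidates"])
```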

Exact kNN

Use a script_score query with a vector function for brute-force scoring. This approach scans all documents in the index and does not scale to large datasets.
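
As an illustration, the following sketch builds one possible exact kNN request body. It assumes cosine similarity is the metric you want and uses the `cosineSimilarity` Painless vector function; other vector functions such as `dotProduct` or `l2norm` would need a different score expression. `build_exact_knn_query` is a hypothetical helper name.

```python
import json


def build_exact_knn_query(field, query_vector, size=10):
    """Build a script_score search body that scores every matching document."""
    return {
        "size": size,
        "query": {
            "script_score": {
                # match_all scans the whole index; narrow this query to score fewer documents.
                "query": {"match_all": {}},
                "script": {
                    # Adding 1.0 keeps the score non-negative, which Elasticsearch requires.
                    "source": f"cosineSimilarity(params.query_vector, '{field}') + 1.0",
                    "params": {"query_vector": query_vector},
                },
            }
        },
    }


# Send the printed JSON as the body of GET my_vector_index/_search.
exact_body = build_exact_knn_query("my_vector", [-5, 9, -12])
print(json.dumps(exact_body, indent=2))
```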

Additional kNN search capabilities

The knn clause supports the following features:

| Feature | Description |
| --- | --- |
| Filtering | Add a filter clause to restrict results to a subset of documents before running the kNN search. |
| Minimum score threshold | Use the similarity parameter to exclude documents below a minimum similarity score. |
| Nested fields | Run kNN searches on vectors stored in nested fields. |
| Multi-field kNN | Query multiple knn fields in a single request. |
| Rescoring | Rerank approximate kNN results with a script rescore for higher precision. |

For the complete feature reference, see k-nearest neighbor (kNN) search.
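
As a sketch of the first two features, the helper below adds an optional filter and an optional similarity threshold to a knn clause. `build_filtered_knn_search` is a hypothetical name, the term filter on my_text is only an example, and the meaning of the similarity value depends on the metric configured for the field, so treat the numbers here as placeholders.

```python
import json


def build_filtered_knn_search(field, query_vector, k, num_candidates,
                              filter_query=None, similarity=None):
    """Build a search body whose knn clause has an optional filter and similarity threshold."""
    knn = {
        "field": field,
        "query_vector": query_vector,
        "k": k,
        "num_candidates": num_candidates,
    }
    if filter_query is not None:
        knn["filter"] = filter_query      # only documents matching this query are candidates
    if similarity is not None:
        knn["similarity"] = similarity    # minimum similarity a hit must reach
    return {"knn": knn, "fields": ["my_text"]}


search_body = build_filtered_knn_search(
    "my_vector", [-5, 9, -12], k=10, num_candidates=100,
    filter_query={"term": {"my_text": "text1"}},
    similarity=0.5,  # placeholder threshold
)
print(json.dumps(search_body, indent=2))
```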

Production considerations

Memory planning

The vector engine stores HNSW indexes in off-heap memory. If the indexes do not fit in the available off-heap memory, they are evicted from the cache and search latency degrades. Size your data nodes before going to production using the memory calculation guidance in this topic.

num_candidates tuning

Start with the default value (Math.min(1.5 * k, 10000)). If recall is below your target, increase num_candidates incrementally and measure again. A larger value improves recall at the cost of higher query latency, because each shard explores more candidates.
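
One way to measure recall during tuning is to compare the document IDs from an approximate kNN search with the IDs from an exact kNN search for the same query vector. The sketch below computes recall at k from two ID lists. `recall_at_k` is a hypothetical helper, and the ID lists are made-up sample values rather than real search output.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the exact top-k document IDs that also appear in the approximate top-k."""
    exact_top = set(exact_ids[:k])
    if not exact_top:
        return 0.0
    approx_top = set(approx_ids[:k])
    return len(exact_top & approx_top) / len(exact_top)


# Sample IDs for illustration only: the approximate search missed document "4".
approx_ids = ["1", "2", "3"]
exact_ids = ["1", "3", "4"]
recall = recall_at_k(approx_ids, exact_ids, k=3)
print(f"recall@3 = {recall:.2f}")
```

Averaging this value over a representative set of query vectors gives a more stable signal than a single query when deciding whether to raise num_candidates.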

Shard planning

Set number_of_shards and number_of_replicas based on your expected data volume, query throughput, and availability requirements.

Dimension consistency

Set dims to the exact output dimension of your embedding model. Dimension mismatches cause indexing errors.

Keep Elasticsearch up to date

The vector engine receives ongoing improvements in performance and cost efficiency. Run the latest Alibaba Cloud Elasticsearch 8.x version to benefit from these updates.