Elasticsearch: FalconSeek vector index user guide

Last Updated: Dec 11, 2025

This document describes the advanced usage of FalconSeek for vector retrieval. It explains how to select algorithms, such as HNSW and RabitQGraph, to accelerate queries for specific needs, including cost-effective storage for large-scale data and extreme performance optimization. This guide covers algorithm selection, parameter settings, index management, and dynamic tuning techniques, and provides complete code examples to help you implement efficient vector retrieval.

Background information

FalconSeek enhances the original index structure of Alibaba Cloud ES by adding a new C++-based vector engine index. The FalconSeek vector index is developed by Alibaba and supports major services within Alibaba Group, such as Taobao, Tmall Search, Recommendation, and Pailitao (search by image). You can use the high-performance vector index integrated into the FalconSeek kernel to build efficient AI applications, such as applications for search by image and semantic search.

The FalconSeek vector index is fully compatible with open source ES vector engine features, such as k-Nearest Neighbors (k-NN) search. To use FalconSeek in an Alibaba Cloud ES index, you can set index_options.type to havenask_native in the index configuration. For more information about advanced parameter settings, see the detailed descriptions that follow.

Usage examples

Basic example

The following example shows how to create an index to store and retrieve vector data. The index is named my_falcon_seek_index. It contains a vector field named product_vector with 128 dimensions and uses the HNSW algorithm.

  1. Create an index.

    PUT /my_falcon_seek_index
    {
      "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 0
      },
      "mappings": {
        "properties": {
          "product_vector": {
            "type": "dense_vector",
            "dims": 128,
            "index": true,
            "similarity": "l2_norm",
            "index_options": {
              "type": "havenask_native",
              "knn_type": "HNSW",
              "m": 32,
              "ef_construction": 400
            }
          },
          "category": {
            "type": "keyword" 
          }
        }
      }
    }

    Core parameters

    • The dense_vector field type is used to store dense vector data. When you define a field of this type in mappings, you must specify the following core properties:

      • dims: The dimension of the vector.

      • similarity: The function to calculate the similarity between vectors.

      • index_options: The detailed configuration for the vector index, including the algorithm type and related parameters.

    • The similarity function is used to measure the degree of similarity between two vectors. Choosing the right function is critical for recall performance.

      | Function | Description | Scenarios |
      | --- | --- | --- |
      | l2_norm | Euclidean distance. This function calculates the straight-line distance between two vectors in a multi-dimensional space. A smaller distance indicates greater similarity. | General scenarios, such as image recognition and facial recognition. |
      | cosine | Cosine similarity. This function calculates the cosine of the angle between two vector directions. A value closer to 1 indicates greater similarity. | Text semantic similarity analysis. This function is not affected by vector length. |
      | dot_product | Inner product. This function calculates the dot product of two vectors. A larger value indicates greater similarity. | Suitable for scenarios that need to consider vector magnitude, such as recommendation systems. |
      | max_inner_product | Same as dot_product, but does not require vector normalization. | Suitable for scenarios that need to consider vector magnitude, such as recommendation systems. |

  2. Write data. You can write documents that contain vectors and metadata, such as product categories, to the index. The length of the product_vector array must be the same as the dimension defined by dims (128).

    POST /my_falcon_seek_index/_doc/1
    {
      "product_vector": [0.12, -0.05, 0.08, 0.24, -0.17, 0.31, 0.02, -0.19, 0.11, 0.28,
        -0.03, 0.15, 0.22, -0.11, 0.09, 0.33, -0.07, 0.14, 0.26, -0.21,
        0.18, 0.29, -0.13, 0.06, 0.35, -0.08, 0.16, 0.23, -0.15, 0.12, 
        0.27, -0.22, 0.19, 0.32, -0.14, 0.07, 0.25, -0.18, 0.13, 0.30,
        -0.09, 0.17, 0.24, -0.16, 0.10, 0.34, -0.10, 0.20, 0.31, -0.23,
        0.15, 0.28, -0.12, 0.11, 0.26, -0.19, 0.14, 0.29, -0.17, 0.08,
        0.22, -0.20, 0.16, 0.27, -0.15, 0.09, 0.25, -0.21, 0.18, 0.30,
        -0.13, 0.07, 0.24, -0.22, 0.19, 0.32, -0.16, 0.10, 0.26, -0.18,
        0.12, 0.28, -0.14, 0.06, 0.23, -0.19, 0.15, 0.29, -0.11, 0.05,
        0.21, -0.17, 0.13, 0.27, -0.10, 0.04, 0.20, -0.15, 0.11, 0.25,
        -0.09, 0.03, 0.19, -0.13, 0.10, 0.24, -0.08, 0.02, 0.18, -0.12,
        0.09, 0.23, -0.07, 0.01, 0.17, -0.11, 0.08, 0.22, 0.05, -0.06,
        0.09, 0.23, -0.07, 0.01, 0.17, -0.11, 0.08, 0.22],
      "category": "clothes"
    }
    POST /my_falcon_seek_index/_doc/2
    {
      "product_vector": [0.12, -0.05, 0.08, 0.24, -0.17, 0.31, 0.02, -0.19, 0.11, 0.28,
        -0.03, 0.15, 0.22, -0.11, 0.09, 0.33, -0.07, 0.14, 0.26, -0.21,
        0.18, 0.29, -0.13, 0.06, 0.35, -0.08, 0.16, 0.23, -0.15, 0.12, 
        0.27, -0.22, 0.19, 0.32, -0.14, 0.07, 0.25, -0.18, 0.13, 0.30,
        -0.09, 0.17, 0.24, -0.16, 0.10, 0.34, -0.10, 0.20, 0.31, -0.23,
        0.15, 0.28, -0.12, 0.11, 0.26, -0.19, 0.14, 0.29, -0.17, 0.08,
        0.22, -0.20, 0.16, 0.27, -0.15, 0.09, 0.25, -0.21, 0.18, 0.30,
        -0.13, 0.07, 0.24, -0.22, 0.19, 0.32, -0.16, 0.10, 0.26, -0.18,
        0.12, 0.28, -0.14, 0.06, 0.23, -0.19, 0.15, 0.29, -0.11, 0.05,
        0.21, -0.17, 0.13, 0.27, -0.10, 0.04, 0.20, -0.15, 0.11, 0.25,
        -0.09, 0.03, 0.19, -0.13, 0.10, 0.24, -0.08, 0.02, 0.18, -0.12,
        0.09, 0.23, -0.07, 0.01, 0.17, -0.11, 0.08, 0.22, 0.05, -0.06,
        0.09, 0.23, -0.07, 0.01, 0.17, -0.11, 0.08, 0.22],
      "category": "clothes"
    }

  3. Vector retrieval (k-NN). You can find the 5 most similar documents based on a given query vector.

    GET /my_falcon_seek_index/_search
    {
      "knn": {
        "field": "product_vector",
        "query_vector": [0.12, -0.05, 0.01, 0.24, -0.17, 0.31, 0.02, -0.19, 0.11, 0.28,
        -0.03, 0.15, 0.22, -0.11, 0.09, 0.23, -0.07, 0.14, 0.26, -0.21,
        0.18, 0.29, -0.13, 0.06, 0.35, -0.18, 0.16, 0.23, -0.15, 0.12, 
        0.27, -0.22, 0.19, 0.32, -0.14, 0.87, 0.25, -0.18, 0.13, 0.30,
        -0.09, 0.17, 0.24, -0.16, 0.10, 0.64, -0.10, 0.20, 0.31, -0.23,
        0.15, 0.28, -0.12, 0.11, 0.26, -0.19, 0.14, 0.29, -0.17, 0.08,
        0.22, -0.20, 0.16, 0.27, -0.15, 0.09, 0.25, -0.21, 0.18, 0.30,
        -0.13, 0.07, 0.24, -0.22, 0.19, 0.52, -0.16, 0.10, 0.26, -0.18,
        0.12, 0.28, -0.14, 0.06, 0.23, -0.19, 0.14, 0.29, -0.11, 0.05,
        0.21, -0.17, 0.13, 0.27, -0.10, 0.04, 0.20, -0.15, 0.11, 0.25,
        -0.09, 0.03, 0.19, -0.13, 0.10, 0.24, -0.08, 0.02, 0.18, -0.12,
        0.09, 0.23, -0.07, 0.01, 0.17, -0.11, 0.08, 0.22, 0.05, -0.06,
        0.09, 0.23, -0.07, 0.01, 0.17, -0.11, 0.08, 0.12],
        "k": 5,
        "num_candidates": 100
      }
    }

    Core parameters

    The knn query is used to perform a k-NN search.

    • field: The name of the dense_vector field to query.

    • query_vector: The vector used for the query. Its dimension must match the field definition.

    • k: The number of most similar results to return.

    • num_candidates: The size of the candidate set that the algorithm searches internally on each shard. This value must be greater than k and is usually a multiple of k. A larger value increases the recall rate but also increases query latency.

  4. Filtered retrieval. You can filter the results during vector retrieval to meet more complex business requirements. The following example finds the 5 most similar documents in the "shoes" category.

    GET /my_falcon_seek_index/_search
    {
      "knn": {
        "field": "product_vector",
        "query_vector": [
          0.12, -0.05, 0.01, 0.24, -0.17, 0.31, 0.02, -0.19, 0.11, 0.28,
          -0.03, 0.15, 0.22, -0.11, 0.09, 0.23, -0.07, 0.14, 0.26, -0.21,
          0.18, 0.29, -0.13, 0.06, 0.35, -0.18, 0.16, 0.23, -0.15, 0.12,
          0.27, -0.22, 0.19, 0.32, -0.14, 0.87, 0.25, -0.18, 0.13, 0.30,
          -0.09, 0.17, 0.24, -0.16, 0.10, 0.64, -0.10, 0.20, 0.31, -0.23,
          0.15, 0.28, -0.12, 0.11, 0.26, -0.19, 0.14, 0.29, -0.17, 0.08,
          0.22, -0.20, 0.16, 0.27, -0.15, 0.09, 0.25, -0.21, 0.18, 0.30,
          -0.13, 0.07, 0.24, -0.22, 0.19, 0.52, -0.16, 0.10, 0.26, -0.18,
          0.12, 0.28, -0.14, 0.06, 0.23, -0.19, 0.14, 0.29, -0.11, 0.05,
          0.21, -0.17, 0.13, 0.27, -0.10, 0.04, 0.20, -0.15, 0.11, 0.25,
          -0.09, 0.03, 0.19, -0.13, 0.10, 0.24, -0.08, 0.02, 0.18, -0.12,
          0.09, 0.23, -0.07, 0.01, 0.17, -0.11, 0.08, 0.22, 0.05, -0.06,
          0.09, 0.23, -0.07, 0.01, 0.17, -0.11, 0.08, 0.12
        ],
        "k": 5,
        "num_candidates": 100,
        "filter": {
          "term": {
            "category": "shoes"
          }
        }
      }
    }
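To make the similarity functions from step 1 concrete, the following Python sketch computes the raw l2_norm, cosine, and dot_product measures for small hand-picked vectors. The function names simply mirror the similarity values in this guide and are not part of any FalconSeek API. The sketch shows only the underlying math: the _score that a search returns may be a transformed version of these values.

```python
import math

def l2_norm(a, b):
    """Euclidean distance: a smaller value means the vectors are more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dot_product(a, b):
    """Inner product: a larger value means the vectors are more similar."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine of the angle between a and b: a value closer to 1 is more similar."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)

a = [1.0, 0.0]
b = [2.0, 0.0]  # same direction as a, but twice as long
c = [0.0, 1.0]  # orthogonal to a

print("l2_norm(a, b)     =", l2_norm(a, b))
print("cosine(a, b)      =", cosine(a, b))
print("dot_product(a, b) =", dot_product(a, b))
print("cosine(a, c)      =", cosine(a, c))
```

Vectors a and b point in the same direction, so their cosine similarity is at its maximum even though their Euclidean distance is not zero. This is why cosine is described as unaffected by vector length, while l2_norm and dot_product are sensitive to magnitude.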

Extended feature: tags_filter filtered retrieval

The tags_filter parameter provides a pre-filtering mechanism that is optimized for the HNSW and QGraph algorithms. It offers higher performance than the standard bool filter because it excludes non-matching nodes early in the graph traversal process.

  • Scenarios: You can use this feature when the filtering field has a limited number of unique values, such as a category or brand ID, and you require high query performance.

  • How to enable this feature:

    1. In the index_options of the mappings, use the tags parameter to declare the keyword field that will be used for filtering.

      PUT /my_vector_index_with_tags
      {
        "mappings": {
          "properties": {
            "product_vector": {
              "type": "dense_vector",
              "dims": 128,
              "index_options": {
                "type": "havenask_native",
                "knn_type": "HNSW",
                "tags": ["category"] // Declare the category field for tags_filter
              }
            },
            "category": {
              "type": "keyword" // Must be of keyword type
            }
          }
        }
      }

    2. In the knn query, you can use the tags_filter parameter with the "field_name = value" syntax. This parameter supports | (OR) and & (AND) logic.

      GET /my_vector_index_with_tags/_search
      {
        "knn": {
          "field": "product_vector",
          "query_vector": [...],
          "k": 5,
          "tags_filter": "category = shoes | category = socks"
        }
      }
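When filter values come from application code, the tags_filter string can be assembled programmatically. The following Python sketch is one way to do this. The helper names tags_or and tags_and are hypothetical, and the sketch assumes only the "field_name = value" syntax with | and & described above. This guide does not state how | and & take precedence when mixed, so each helper uses a single operator.

```python
def tags_or(field, values):
    """Match documents whose field equals any of the values (joined with |)."""
    return " | ".join(f"{field} = {value}" for value in values)

def tags_and(conditions):
    """Match documents that satisfy every field/value pair (joined with &)."""
    return " & ".join(f"{field} = {value}" for field, value in conditions.items())

# Reproduces the expression used in the query above.
expression = tags_or("category", ["shoes", "socks"])
print(expression)

knn = {
    "field": "product_vector",
    "query_vector": [0.0] * 128,  # placeholder vector for illustration only
    "k": 5,
    "tags_filter": expression,
}
```

Every field referenced in the expression must be declared in the tags list of the index mappings, as shown in the first step.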
      

Algorithm selection and performance guide

Choosing the right knn_type algorithm is critical for achieving the best balance of performance, cost, and precision.

Algorithm comparison and recommendations

| Algorithm | Recall rate | Query speed | Memory usage | Scenarios and recommendations | Core limitations |
| --- | --- | --- | --- | --- | --- |
| HNSW | High | Fast | Medium | General-purpose first choice. Suitable for most scenarios, providing the best balance between recall rate, performance, and resource consumption. It performs well with datasets ranging from 100,000 to 10 million documents. | No special limitations. |
| RabitQGraph | High | Fastest | Lowest | For extreme performance and cost-sensitive scenarios. Suitable for datasets that require millisecond-level query latency or are extremely sensitive to memory costs, especially for very large datasets (over 10 million documents). | Only supports l2_norm similarity. Vector dimensions must be a positive integer multiple of 64. |
| QGraph | High | Fast | Low | For large-scale, storage cost-sensitive scenarios. Significantly reduces memory and storage usage through vector quantization. Suitable for datasets with over 5 million documents. | No special limitations. |
| QC | Medium | Medium | Low | For large-scale, memory-constrained scenarios. Consider this when build time is not sensitive, but runtime memory resources are very tight. Suitable for datasets with over 1 million documents. | Longer build time. |
| Linear | Highest (100%) | Slow | Low | For small datasets or scenarios requiring absolute precision. Suitable for scenarios where the total number of documents is less than brute_force_threshold (default 1000) or linear_build_threshold. The system switches automatically. Query time increases linearly with data volume. | Only suitable for small datasets. |

Scenario-based selection path

  • If you are just starting or are unsure which algorithm to use, choose HNSW.

  • If the data volume grows and performance degrades:

    • If you have sufficient memory and want to achieve higher queries per second (QPS), you can migrate from HNSW to RabitQGraph, provided that the index meets the RabitQGraph requirements (l2_norm similarity and a vector dimension that is a multiple of 64).

    • If memory or storage costs are a concern, you can migrate from HNSW to QGraph and configure a quantizer.

  • For very large datasets with over 10 million documents:

    • If you require extremely low query latency, choose RabitQGraph.

    • If you are cost-sensitive and can tolerate slightly higher latency, choose QGraph.
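The selection path above can be sketched as a small decision function. This is only one possible encoding of the guidance in this section, not an official rule. The function name choose_knn_type and its flags are hypothetical, and the only hard constraints it checks are the RabitQGraph limitations listed in the comparison table.

```python
def choose_knn_type(similarity, dims, need_lower_latency=False, cost_sensitive=False):
    """Return a knn_type suggestion that follows the selection path in this guide."""
    # RabitQGraph only supports l2_norm and dimensions that are a multiple of 64.
    rabitq_supported = similarity == "l2_norm" and dims > 0 and dims % 64 == 0

    if need_lower_latency and rabitq_supported:
        return "RabitQGraph"
    if cost_sensitive:
        # QGraph should be paired with a quantizer to reduce memory and storage.
        return "QGraph"
    # Default choice when starting out or when no special requirement applies.
    return "HNSW"

print(choose_knn_type("l2_norm", 128, need_lower_latency=True))
print(choose_knn_type("cosine", 768, cost_sensitive=True))
print(choose_knn_type("cosine", 768))
```

The function falls back to HNSW whenever RabitQGraph is requested but its constraints are not met, which matches the advice to start with HNSW when unsure.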

Index management (Mappings)

When you create an index, all vector-related configurations are set within the index_options block of the mappings.

PUT /<your_index_name>
{
  "mappings": {
    "properties": {
      "<your_vector_field>": {
        "type": "dense_vector",
        "dims": 768,
        "similarity": "l2_norm",
        "index_options": {
          // --- General parameters ---
          "type": "havenask_native",
          "knn_type": "HNSW",
          // --- Algorithm build parameters ---
          "m": 32,
          "ef_construction": 400,
          // --- Advanced parameters ---
          "thread_count": 8
        }
      }
    }
  }
}

General parameters

| Parameter | Description | Type | Required | Default |
| --- | --- | --- | --- | --- |
| type | Specifies the FalconSeek vector index engine. Must be set to "havenask_native". | String | Yes | None |
| knn_type | Specifies the vector index algorithm. Options are "HNSW", "RabitQGraph", "QGraph", "QC", and "Linear". | String | No | "HNSW" |

Algorithm build parameters

The following parameters take effect during index construction and determine the structure and quality of the index.

Common parameters for HNSW and QGraph

| Parameter | Description | Type | Default | Recommended values and impact |
| --- | --- | --- | --- | --- |
| m | The maximum number of neighbors for each node in the graph. | Integer | 16 | Impact: Directly affects recall rate and memory usage. Recommendation: 16 (low memory), 32 (balanced), 64-128 (high recall rate). A larger value increases the recall rate, but also increases memory usage and build time. The value range is 4-128. |
| ef_construction | The width of the candidate set search during graph construction. | Integer | 200 | Impact: Determines the quality and time of index construction. Recommendation: 200 (fast build), 400-500 (balanced), 800+ (high-quality build). A larger value results in higher index quality (and ultimately a higher recall rate), but a longer build time. The value range is 10-2000. |

QGraph-specific parameters

| Parameter | Description | Type | Default | Recommended values and impact |
| --- | --- | --- | --- | --- |
| quantizer | The vector quantization method, used to compress vectors to reduce memory and storage usage. | String | None | Impact: Significantly reduces resource usage, but may cause a slight loss of precision. Recommendation: "int8" (recommended, 4x compression relative to 32-bit floats), "int4" (higher compression ratio), "fp16" (high precision, 2x compression), "2bit" (extreme compression). |
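To get a rough sense of what each quantizer saves, the following Python sketch estimates the size of the raw vector data alone. It assumes that uncompressed vectors are stored as 32-bit floats and that each quantizer name reflects the number of bits kept per dimension. Both points are assumptions made for illustration, and the estimate ignores the graph structure and any other index overhead.

```python
# Assumed bits per dimension for each encoding (inferred from the quantizer names).
BITS_PER_DIMENSION = {"fp32": 32, "fp16": 16, "int8": 8, "int4": 4, "2bit": 2}

def raw_vector_bytes(doc_count, dims, encoding):
    """Estimate bytes needed for the vector values only, excluding graph overhead."""
    return doc_count * dims * BITS_PER_DIMENSION[encoding] // 8

doc_count, dims = 1_000_000, 768
baseline = raw_vector_bytes(doc_count, dims, "fp32")
for encoding in BITS_PER_DIMENSION:
    size = raw_vector_bytes(doc_count, dims, encoding)
    print(f"{encoding}: {size / 2**20:.0f} MiB, {baseline // size}x smaller than fp32")
```

Real memory usage depends on the engine's internal layout, so treat these numbers as a lower bound when sizing a cluster.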

Advanced parameters

| Parameter | Description | Type | Default | Recommended values and impact |
| --- | --- | --- | --- | --- |
| thread_count | The number of parallel threads used during index construction. | Integer | 1 | Impact: Speeds up the build process. Recommendation: Set to 0 to automatically use all CPU cores of the machine, or specify a value (1-32) based on your resources. |
| tags | Declares a list of fields for high-performance filtering with tags_filter. | Array | [] | Impact: Enables high-performance pre-filtering. Recommendation: Add frequently filtered keyword fields with low cardinality to the list, such as ["category", "brand_id"]. |
| linear_build_threshold | When the total number of documents in the index is below this threshold, the system automatically switches to Linear (brute-force) search to optimize resource consumption and query efficiency for small datasets. | Integer | 0 | Impact: Avoids building complex index structures for small datasets. Recommendation: For indexes that may have very few documents, you can set this to 1000 or 5000. |
| index_params | A configuration interface for the underlying engine's parameters, used for advanced tuning. | Object | {} | Impact: Allows fine-grained control over every detail of the build and query processes. Important: Parameters within index_params will overwrite top-level parameters with the same name (such as m and ef_construction). This is an interface for advanced users. Regular users should use the top-level parameters. |

The index_params parameter provides direct access to the underlying proxima.* and param.* parameters. For example, the top-level m parameter corresponds to the underlying proxima.hnsw.builder.max_neighbor_count. You should use this configuration only when you need to adjust details that are not exposed by the top-level parameters.
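The override rule can be sketched in a few lines of Python. The sketch models only the single mapping that this guide states explicitly (m to proxima.hnsw.builder.max_neighbor_count). The function name effective_engine_params is hypothetical, and the way the engine actually merges these settings internally is an assumption.

```python
# Only the mapping documented in this guide is modeled here.
TOP_LEVEL_TO_ENGINE = {"m": "proxima.hnsw.builder.max_neighbor_count"}

def effective_engine_params(index_options):
    """Resolve engine-level values: index_params entries win over top-level ones."""
    resolved = {}
    for name, engine_key in TOP_LEVEL_TO_ENGINE.items():
        if name in index_options:
            resolved[engine_key] = index_options[name]
    # Values in index_params overwrite anything derived from top-level parameters.
    resolved.update(index_options.get("index_params", {}))
    return resolved

options = {
    "knn_type": "HNSW",
    "m": 32,
    "index_params": {"proxima.hnsw.builder.max_neighbor_count": 48},
}
print(effective_engine_params(options))
```

With both values present, the resolved neighbor count is 48. Removing the index_params entry leaves the top-level value of 32 in effect.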

"index_options": {
  "knn_type": "HNSW",
  "m": 32, // This will be overwritten by index_params below
  "index_params": {
    "proxima.hnsw.builder.max_neighbor_count": 48 // The final effective value is 48
  }
}

The knn query body

| Parameter | Description | Type | Required |
| --- | --- | --- | --- |
| field | The name of the dense_vector field to perform the k-NN search on. | String | Yes |
| query_vector | The vector used for the query. | Array | Yes |
| k | The number of most similar results to return. | Integer | Yes |
| num_candidates | The size of the search candidate set on each shard. A larger value increases the recall rate but slows down the query. | Integer | No (Recommended) |
| tags_filter | (Optional) High-performance pre-filtering for HNSW and QGraph. | String | No |
| search_params | (Optional) Dynamically adjust some search parameters at query time for temporary tuning. | Object | No |
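The following Python sketch shows one way to assemble a knn query body from these parameters and to catch common mistakes before a request is sent. The helper build_knn_query is hypothetical. Its checks reflect the rules stated in this guide (the query vector length must match dims, and num_candidates must be greater than k), plus the assumption that k must be positive.

```python
import json

def build_knn_query(field, query_vector, k, dims, num_candidates=None,
                    tags_filter=None, search_params=None):
    """Assemble a knn search body and validate it against the documented rules."""
    if len(query_vector) != dims:
        raise ValueError(f"query_vector has {len(query_vector)} values, expected {dims}")
    if k < 1:
        raise ValueError("k must be a positive integer")  # assumption, see lead-in
    knn = {"field": field, "query_vector": list(query_vector), "k": k}
    if num_candidates is not None:
        # This guide states that num_candidates must be greater than k.
        if num_candidates <= k:
            raise ValueError("num_candidates must be greater than k")
        knn["num_candidates"] = num_candidates
    if tags_filter is not None:
        knn["tags_filter"] = tags_filter
    if search_params:
        knn["search_params"] = dict(search_params)
    return {"knn": knn}

body = build_knn_query("product_vector", [0.1] * 128, k=5, dims=128, num_candidates=100)
print(json.dumps(body)[:60], "...")
```

The returned dictionary can be serialized with json.dumps and sent as the body of a GET /<index>/_search request by any HTTP client.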

Parameter description

HNSW

The parameters for the Hierarchical Navigable Small World (HNSW) algorithm use the proxima.hnsw. namespace and can be set using index_params.

HNSW Builder

| Parameter name | Type | Default | Description |
| --- | --- | --- | --- |
| proxima.hnsw.builder.max_neighbor_count | uint32 | 100 | The number of neighbors in the graph. A larger value makes the graph more accurate but increases computation and storage overhead. It generally should not exceed the feature dimension. The maximum is 65535. |
| proxima.hnsw.builder.efconstruction | uint32 | 500 | Controls the precision of graph construction. A larger value results in a more accurate graph but takes longer to build. |
| proxima.hnsw.builder.thread_count | uint32 | 0 | The number of threads to use during construction. If set to 0, it uses the number of CPU cores. |
| proxima.hnsw.builder.memory_quota | uint64 | 0 | Limits the maximum memory for construction. Disk-based construction is not currently supported. If this value is exceeded, the build will fail. |
| proxima.hnsw.builder.scaling_factor | uint32 | 50 | The ratio of nodes between graph layers. Generally does not need to be modified. The value range is [5,1000]. |
| proxima.hnsw.builder.neighbor_prune_ratio | float | 0.5 | Controls the number of neighbors at which edge pruning begins in the neighbor table. Generally does not need to be modified. |
| proxima.hnsw.builder.upper_neighbor_ratio | float | 0.5 | The ratio of neighbors in the upper layer of the graph relative to the layer 0 graph. Generally does not need to be modified. |
| proxima.hnsw.builder.enable_adsampling | bool | false | Specifies whether to enable adsampling. Disabled by default. Currently only supports Euclidean distance calculation for fp32 datasets. Not recommended for dimensions below 256. |
| proxima.hnsw.builder.slack_pruning_factor | float | 1.0 | The slack pruning factor. A value in the range [1.1, 1.2] is recommended. For the gist960 and sift128 datasets, 1.1 is recommended. |

HNSW Searcher

| Parameter name | Type | Default | Description |
| --- | --- | --- | --- |
| proxima.hnsw.searcher.ef | uint32 | 500 | Used to control retrieval precision. A larger value scans more documents and increases the recall rate. |
| proxima.hnsw.searcher.max_scan_ratio | float | 0.1 | Controls the maximum proportion of documents to scan during retrieval. If the ef value converges early, this ratio may not be reached. |
| proxima.hnsw.searcher.neighbors_in_memory_enable | bool | false | If enabled, keeps the neighbor table in memory, which improves performance but consumes more memory. |
| proxima.hnsw.searcher.check_crc_enable | bool | false | Specifies whether to perform a cyclic redundancy check (CRC) on the index. Enabling this will increase load time. |
| proxima.hnsw.searcher.visit_bloomfilter_enable | bool | false | Uses a bloom filter as the container for deduplicating visited graph nodes. This optimizes memory but slightly degrades performance. |
| proxima.hnsw.searcher.visit_bloomfilter_negative_prob | float | 0.001 | The accuracy of the bloom filter. A smaller value is more accurate but uses more memory. |
| proxima.hnsw.searcher.brute_force_threshold | int | 1000 | If the total number of documents is less than this value, a linear search is performed. |

HNSW configuration examples

// Basic HNSW configuration
PUT /hnsw_basic
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "dims": 768,
        "index": true,
        "similarity": "cosine",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "HNSW"
        }
      }
    }
  }
}

// High-performance HNSW configuration
PUT /hnsw_performance
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "dims": 1024,
        "index": true,
        "similarity": "dot_product",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "HNSW",
          "m": 48,
          "ef_construction": 500,
          "thread_count": 8,
          "linear_build_threshold": 1000,
          "is_embedding_saved": true,
          "embedding_load_strategy": "ANN_INDEX_FILE",
          "index_load_strategy": "MEM"
        }
      }
    }
  }
}

RabitQGraph

RabitQGraph Builder

| Parameter name | Type | Default | Description |
| --- | --- | --- | --- |
| param.rabitQGraph.builder.neighbor_cnt | uint32 | 128 | The number of neighbors for each node, affecting graph connectivity and search precision. |
| param.rabitQGraph.builder.ef_construction | uint32 | 512 | The EF parameter for building, which controls the number of candidate nodes during construction. |
| param.rabitQGraph.builder.prune_ratio | float | 0.5 | Neighbor pruning ratio, used to optimize the graph structure. |
| param.rabitQGraph.builder.cluster_count | uint32 | 64 | The number of cluster centroids, used for vector quantization. |
| param.rabitQGraph.builder.quantized_bit_count | uint32 | 1 | The number of bits for quantization. Can only be set to 1, 4, 5, 8, or 9. |
| param.rabitQGraph.builder.slack_prune_factor | float | 1.0 | Slack pruning factor, used to control the pruning policy. |
| param.rabitQGraph.builder.repair_connectivity | bool | true | Specifies whether to repair graph connectivity. |
| param.rabitQGraph.builder.thread_count | uint32 | 0 | The number of threads used during construction. If set to 0, it uses the number of CPU cores. |
| param.rabitQGraph.builder.ckpt_count | uint32 | 0 | The number of checkpoints, used for incremental builds. |
| param.rabitQGraph.builder.ckpt_threshold | uint32 | 2000000 | Checkpoint threshold. |

RabitQGraph Searcher

| Parameter name | Type | Default | Description |
| --- | --- | --- | --- |
| param.rabitQGraph.searcher.ef | uint32 | 250 | The EF parameter for searching, which affects search precision and performance. |
| param.rabitQGraph.searcher.max_scan_ratio | double | 0.05 | Maximum scan ratio, which limits the percentage of nodes to search. |
| param.rabitQGraph.searcher.check_crc_enable | bool | false | Specifies whether to enable CRC check. |
| param.rabitQGraph.searcher.thread_count | uint32 | 1 | The number of threads to use for searching. |
| param.rabitQGraph.searcher.thread_safe_filter | bool | false | Specifies whether to enable thread-safe filtering. |

RabitQGraph configuration examples

// High-performance configuration
{
  "index_params": {
    "param.rabitQGraph.builder.neighbor_cnt": 256,
    "param.rabitQGraph.builder.ef_construction": 512,
    "param.rabitQGraph.builder.quantized_bit_count": 8,
    "param.rabitQGraph.builder.cluster_count": 128,
    "param.rabitQGraph.builder.thread_count": 8,
    "param.rabitQGraph.searcher.ef": 300,
    "param.rabitQGraph.searcher.max_scan_ratio": 0.1
  }
}

// Memory-optimized configuration
{
  "index_params": {
    "param.rabitQGraph.builder.neighbor_cnt": 64,
    "param.rabitQGraph.builder.ef_construction": 200,
    "param.rabitQGraph.builder.quantized_bit_count": 1,
    "param.rabitQGraph.builder.cluster_count": 32,
    "param.rabitQGraph.searcher.ef": 150,
    "param.rabitQGraph.searcher.max_scan_ratio": 0.03
  }
}

// Balanced configuration
{
  "index_params": {
    "param.rabitQGraph.builder.neighbor_cnt": 128,
    "param.rabitQGraph.builder.ef_construction": 400,
    "param.rabitQGraph.builder.quantized_bit_count": 4,
    "param.rabitQGraph.builder.cluster_count": 64,
    "param.rabitQGraph.searcher.ef": 250
  }
}
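Because RabitQGraph has stricter requirements than the other algorithms, it can be useful to validate a planned configuration before creating the index. The following Python sketch checks the constraints stated in this guide: l2_norm similarity only, a dimension that is a positive multiple of 64, and a quantized_bit_count of 1, 4, 5, 8, or 9. The helper name rabitq_problems is hypothetical.

```python
ALLOWED_QUANTIZED_BIT_COUNTS = {1, 4, 5, 8, 9}

def rabitq_problems(similarity, dims, quantized_bit_count=1):
    """Return a list of constraint violations; an empty list means the config is valid."""
    problems = []
    if similarity != "l2_norm":
        problems.append(f"similarity must be l2_norm, got {similarity}")
    if dims <= 0 or dims % 64 != 0:
        problems.append(f"dims must be a positive multiple of 64, got {dims}")
    if quantized_bit_count not in ALLOWED_QUANTIZED_BIT_COUNTS:
        problems.append(f"quantized_bit_count must be 1, 4, 5, 8, or 9, got {quantized_bit_count}")
    return problems

print(rabitq_problems("l2_norm", 128, quantized_bit_count=8))
print(rabitq_problems("cosine", 100, quantized_bit_count=2))
```

Returning every violation at once, rather than raising on the first, makes it easier to fix a mapping in a single pass.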

Linear

The Linear algorithm performs a linear brute-force search. Its parameters are relatively simple and use the proxima.linear. namespace.

Linear Builder/Searcher

| Parameter name | Type | Default | Description |
| --- | --- | --- | --- |
| proxima.linear.builder.column_major_order | string | false | Specifies whether to use row-major (false) or column-major (true) order for features during construction. |
| proxima.linear.searcher.read_block_size | uint32 | 1048576 | The size of the block to read into memory at one time during the search phase. The recommended value is 1 MB. |

Linear configuration examples

// Basic configuration
{
  "index_params": {
    "proxima.linear.builder.column_major_order": "false",
    "proxima.linear.searcher.read_block_size": 1048576
  }
}

// Large memory configuration
{
  "index_params": {
    "proxima.linear.builder.column_major_order": "true",
    "proxima.linear.searcher.read_block_size": 2097152
  }
}

QC

The QC (Quantization Clustering) algorithm uses a quantization clustering index. Its parameters use the proxima.qc. namespace.

QC Builder

| Parameter name | Type | Default | Description |
| --- | --- | --- | --- |
| proxima.qc.builder.train_sample_count | uint32 | 0 | Specifies the amount of training data. If 0, all data is used. |
| proxima.qc.builder.thread_count | uint32 | 0 | The number of threads to use during construction. If set to 0, it uses the number of CPU cores. |
| proxima.qc.builder.centroid_count | string | - | Cluster centroid parameter. Supports hierarchical clustering. Separate layers with "*". |
| proxima.qc.builder.cluster_class | string | OptKmeansCluster | Specifies the clustering method. |
| proxima.qc.builder.cluster_auto_tuning | bool | false | Specifies whether to enable automatic tuning of the number of centroids. |
| proxima.qc.builder.optimizer_class | string | HcBuilder | The optimizer for the centroid part, used to improve precision during classification. |
| proxima.qc.builder.optimizer_params | IndexParams | - | The build and retrieval parameters corresponding to the optimizer. |
| proxima.qc.builder.converter_class | string | - | If Measure is InnerProduct, a MIPS transformation is automatically performed. |
| proxima.qc.builder.converter_params | IndexParams | - | Initialization parameters for converter_class. |
| proxima.qc.builder.quantizer_class | string | - | Configures the quantizer. Options include Int8QuantizerConverter, Int4QuantizerConverter, etc. |
| proxima.qc.builder.quantizer_params | IndexParams | - | Parameters related to the quantizer. |
| proxima.qc.builder.quantize_by_centroid | bool | false | When using quantizer_class, specifies whether to quantize by centroid. |
| proxima.qc.builder.store_original_features | bool | false | Specifies whether to store the original features. |

QC Searcher

| Parameter name | Type | Default | Description |
| --- | --- | --- | --- |
| proxima.qc.searcher.scan_ratio | float | 0.01 | Used to calculate max_scan_num: total doc count * scan_ratio. |
| proxima.qc.searcher.optimizer_params | IndexParams | - | Specifies the online retrieval parameters corresponding to the optimizer used during the build. |
| proxima.qc.searcher.brute_force_threshold | int | 1000 | If the total number of documents is less than this value, a linear search is performed. |
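The effect of scan_ratio is easiest to see with concrete numbers. The following Python sketch applies the formula from the scan_ratio description (max_scan_num = total doc count * scan_ratio) and the brute_force_threshold rule. The function names are hypothetical, and rounding to the nearest integer is an assumption because this guide does not say how the engine rounds.

```python
def qc_max_scan_num(total_doc_count, scan_ratio=0.01):
    """Upper bound on documents scanned per query: total doc count * scan_ratio."""
    return round(total_doc_count * scan_ratio)  # rounding behavior is assumed

def qc_uses_linear_search(total_doc_count, brute_force_threshold=1000):
    """A linear search is performed when the index has fewer documents than the threshold."""
    return total_doc_count < brute_force_threshold

for ratio in (0.01, 0.02, 0.05):
    print(f"scan_ratio={ratio}: up to {qc_max_scan_num(5_000_000, ratio)} documents scanned")
print("500 documents, linear search:", qc_uses_linear_search(500))
```

Raising scan_ratio from 0.01 to 0.05 allows five times as many documents to be scanned per query, which is the trade-off between recall rate and latency that this parameter controls.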

QC configuration examples

// Basic configuration
{
  "index_params": {
    "proxima.qc.builder.thread_count": 4,
    "proxima.qc.builder.centroid_count": "1000",
    "proxima.qc.builder.cluster_class": "OptKmeansCluster",
    "proxima.qc.searcher.scan_ratio": 0.02
  }
}

// Hierarchical clustering configuration
{
  "index_params": {
    "proxima.qc.builder.thread_count": 8,
    "proxima.qc.builder.centroid_count": "100*100",
    "proxima.qc.builder.optimizer_class": "HnswBuilder",
    "proxima.qc.builder.quantizer_class": "Int8QuantizerConverter",
    "proxima.qc.searcher.scan_ratio": 0.01
  }
}

// High-precision configuration
{
  "index_params": {
    "proxima.qc.builder.thread_count": 12,
    "proxima.qc.builder.centroid_count": "2000",
    "proxima.qc.builder.train_sample_count": 100000,
    "proxima.qc.builder.store_original_features": true,
    "proxima.qc.searcher.scan_ratio": 0.05
  }
}

QGraph

The QGraph (Quantized Graph) algorithm uses a quantized graph index. It inherits most of the parameters from HNSW and adds quantization-related parameters.

QGraph Builder

QGraph inherits all HNSW Builder parameters and adds the following:

| Parameter name | Type | Default | Description |
| --- | --- | --- | --- |
| proxima.qgraph.builder.quantizer_class | string | - | Configures the quantizer. Options include Int8QuantizerConverter, Int4QuantizerConverter, HalfFloatConverter, and DoubleBitConverter. |
| proxima.qgraph.builder.quantizer_params | IndexParams | - | Configures parameters related to the quantizer. |

All proxima.hnsw.builder.* parameters also apply to QGraph.

QGraph Searcher

QGraph inherits all HNSW Searcher parameters.

QGraph configuration examples

// Int8 quantization configuration
{
  "index_params": {
    "proxima.hnsw.builder.max_neighbor_count": 32,
    "proxima.hnsw.builder.efconstruction": 400,
    "proxima.hnsw.builder.thread_count": 4,
    "proxima.qgraph.builder.quantizer_class": "Int8QuantizerConverter",
    "proxima.qgraph.builder.quantizer_params": {},
    "proxima.hnsw.searcher.ef": 300
  }
}

// Int4 quantization configuration (more memory-efficient)
{
  "index_params": {
    "proxima.hnsw.builder.max_neighbor_count": 48,
    "proxima.hnsw.builder.efconstruction": 500,
    "proxima.hnsw.builder.thread_count": 6,
    "proxima.qgraph.builder.quantizer_class": "Int4QuantizerConverter",
    "proxima.qgraph.builder.quantizer_params": {},
    "proxima.hnsw.searcher.ef": 400,
    "proxima.hnsw.searcher.max_scan_ratio": 0.1
  }
}

// HalfFloat quantization configuration (high precision)
{
  "index_params": {
    "proxima.hnsw.builder.max_neighbor_count": 64,
    "proxima.hnsw.builder.efconstruction": 600,
    "proxima.hnsw.builder.thread_count": 8,
    "proxima.qgraph.builder.quantizer_class": "HalfFloatConverter",
    "proxima.qgraph.builder.quantizer_params": {},
    "proxima.hnsw.searcher.ef": 500
  }
}

Dynamic parameter adjustment with search_params

You can temporarily adjust certain search parameters without rebuilding the index to balance the recall rate and query latency in different scenarios.

For example, for normal online queries, you can use the default or a lower ef value to ensure low latency. For batch analytics or tasks that require high precision, you can temporarily increase the ef value using search_params to obtain a higher recall rate.

HNSW

{
  "search_params": {
    "proxima.hnsw.searcher.ef": "400",
    "proxima.hnsw.searcher.max_scan_ratio": "0.15"
  }
}

QGraph

{
  "search_params": {
    "proxima.hnsw.searcher.ef": "400",
    "proxima.hnsw.searcher.max_scan_ratio": "0.15"
  }
}

QC

{
  "search_params": {
    "proxima.qc.searcher.scan_ratio": "0.02"
  }
}

RabitQGraph

{
  "search_params": {
    "param.rabitQGraph.searcher.ef": "300",
    "param.rabitQGraph.searcher.max_scan_ratio": "0.08"
  }
}

search_params configuration examples

// Dynamically adjust precision and performance for HNSW
GET vector_index/_search
{
  "knn": {
    "field": "vector",
    "query_vector": [0.1, 0.2, 0.3],
    "k": 10,
    "num_candidates": 100,
    "search_params": {
      "proxima.hnsw.searcher.ef": "500",
      "proxima.hnsw.searcher.max_scan_ratio": "0.2"
    }
  }
}

// Dynamically adjust scan ratio for QC
GET vector_index/_search
{
  "knn": {
    "field": "vector",
    "query_vector": [0.1, 0.2, 0.3],
    "k": 10,
    "num_candidates": 100,
    "search_params": {
      "proxima.qc.searcher.scan_ratio": "0.05"
    }
  }
}

// Dynamically adjust search precision for RabitQGraph
GET vector_index/_search
{
  "knn": {
    "field": "vector",
    "query_vector": [0.1, 0.2, 0.3],
    "k": 10,
    "num_candidates": 100,
    "search_params": {
      "param.rabitQGraph.searcher.ef": "400",
      "param.rabitQGraph.searcher.max_scan_ratio": "0.1"
    }
  }
}
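
The examples above do not include QGraph, which inherits the HNSW Searcher parameters. A sketch of the equivalent request, assuming vector_index was created with knn_type set to QGraph, reuses the proxima.hnsw.searcher.* keys:

```json
// Dynamically adjust precision and performance for QGraph (uses the HNSW searcher keys)
GET vector_index/_search
{
  "knn": {
    "field": "vector",
    "query_vector": [0.1, 0.2, 0.3],
    "k": 10,
    "num_candidates": 100,
    "search_params": {
      "proxima.hnsw.searcher.ef": "400",
      "proxima.hnsw.searcher.max_scan_ratio": "0.15"
    }
  }
}
```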

linear_build_threshold

linear_build_threshold is an optional integer parameter. The default value is 0, which disables the threshold. If the number of documents is less than the threshold, a linear (brute-force) search is performed instead of building the more complex index specified by knn_type.

// Disable the linear threshold
{
  "linear_build_threshold": 0
}

// Use linear search for small datasets
{
  "linear_build_threshold": 1000
}

// A larger threshold, suitable for staging environments
{
  "linear_build_threshold": 5000
}
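
The fragments above show only the parameter itself. In a mapping, linear_build_threshold is set inside index_options, as in the appendix examples. The following sketch uses a hypothetical index name, small_catalog, and illustrative values:

```json
// Sketch: with fewer than 1000 documents, a linear search is used;
// otherwise the HNSW index is built
PUT /small_catalog
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "dims": 128,
        "index": true,
        "similarity": "l2_norm",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "HNSW",
          "linear_build_threshold": 1000
        }
      }
    }
  }
}
```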

Appendix: Complete examples

HNSW

// Basic HNSW configuration
PUT /hnsw_basic
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "dims": 768,
        "index": true,
        "similarity": "cosine",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "HNSW"
        }
      }
    }
  }
}

// High-performance HNSW configuration
PUT /hnsw_performance
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "dims": 1024,
        "index": true,
        "similarity": "dot_product",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "HNSW",
          "m": 48,
          "ef_construction": 500,
          "thread_count": 8,
          "linear_build_threshold": 1000,
          "is_embedding_saved": true,
          "embedding_load_strategy": "ANN_INDEX_FILE",
          "index_load_strategy": "MEM"
        }
      }
    }
  }
}

Linear

// Basic Linear configuration
PUT /linear_basic
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "dims": 384,
        "index": true,
        "similarity": "l2_norm",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "Linear"
        }
      }
    }
  }
}

QC

// Basic QC configuration
PUT /qc_basic
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "dims": 768,
        "index": true,
        "similarity": "cosine",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "QC"
        }
      }
    }
  }
}

// Custom QC configuration - using index_params in JSON string format
PUT /qc_custom
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "dims": 1024,
        "index": true,
        "similarity": "dot_product",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "QC",
          "thread_count": 8,
          "linear_build_threshold": 5000,
          "index_params": "{\"proxima.qc.builder.thread_count\": 8, \"proxima.qc.builder.centroid_count\": \"2000\"}"
        }
      }
    }
  }
}

QGraph

// Basic QGraph configuration
PUT /qgraph_basic
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "dims": 768,
        "index": true,
        "similarity": "cosine",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "QGraph",
          "quantizer": "int8"
        }
      }
    }
  }
}

// High-precision QGraph configuration
PUT /qgraph_high_precision
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "dims": 1536,
        "index": true,
        "similarity": "cosine",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "QGraph",
          "m": 40,
          "ef_construction": 600,
          "thread_count": 8,
          "quantizer": "fp16",
          "is_embedding_saved": true,
          "index_load_strategy": "MEM"
        }
      }
    }
  }
}

// Memory-optimized QGraph configuration
PUT /qgraph_memory_optimized
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "dims": 1024,
        "index": true,
        "similarity": "l2_norm",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "QGraph",
          "m": 24,
          "ef_construction": 300,
          "thread_count": 6,
          "quantizer": "int4",
          "is_embedding_saved": false,
          "index_load_strategy": "BUFFER"
        }
      }
    }
  }
}

RabitQGraph

// Basic RabitQGraph configuration (only supports l2_norm similarity)
PUT /rabitqgraph_basic
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "dims": 64,
        "index": true,
        "similarity": "l2_norm",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "RabitQGraph"
        }
      }
    }
  }
}

// High-performance RabitQGraph configuration - using index_params in Map format
PUT /rabitqgraph_performance
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "dims": 128,
        "index": true,
        "similarity": "l2_norm",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "RabitQGraph",
          "thread_count": 8,
          "index_params": {
            "param.rabitQGraph.builder.neighbor_cnt": 256,
            "param.rabitQGraph.builder.ef_construction": 512,
            "param.rabitQGraph.builder.quantized_bit_count": 4,
            "param.rabitQGraph.builder.cluster_count": 128,
            "param.rabitQGraph.searcher.ef": 300
          }
        }
      }
    }
  }
}

// Memory-optimized RabitQGraph configuration
PUT /rabitqgraph_memory_optimized
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "dims": 192,
        "index": true,
        "similarity": "l2_norm",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "RabitQGraph",
          "thread_count": 4,
          "linear_build_threshold": 1000,
          "index_params": {
            "param.rabitQGraph.builder.neighbor_cnt": 64,
            "param.rabitQGraph.builder.ef_construction": 200,
            "param.rabitQGraph.builder.quantized_bit_count": 1,
            "param.rabitQGraph.builder.cluster_count": 32,
            "param.rabitQGraph.searcher.ef": 150,
            "param.rabitQGraph.searcher.max_scan_ratio": 0.05
          }
        }
      }
    }
  }
}

// RabitQGraph configuration with tag filtering
PUT /rabitqgraph_with_tags
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "dims": 256,
        "index": true,
        "similarity": "l2_norm",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "RabitQGraph",
          "tags": ["category", "region"],
          "index_params": {
            "param.rabitQGraph.builder.neighbor_cnt": 128,
            "param.rabitQGraph.builder.ef_construction": 400,
            "param.rabitQGraph.builder.quantized_bit_count": 8,
            "param.rabitQGraph.searcher.ef": 250
          }
        }
      },
      "category": {
        "type": "keyword"
      },
      "region": {
        "type": "keyword"
      }
    }
  }
}