This document describes the advanced usage of FalconSeek for vector retrieval. It explains how to select algorithms, such as HNSW and RabitQGraph, to accelerate queries for specific needs, including cost-effective storage for large-scale data and extreme performance optimization. This guide covers algorithm selection, parameter settings, index management, and dynamic tuning techniques, and provides complete code examples to help you implement efficient vector retrieval.
Background information
FalconSeek enhances the original index structure of Alibaba Cloud ES by adding a new C++-based vector engine index. The FalconSeek vector index is developed by Alibaba and supports major services within Alibaba Group, such as Taobao, Tmall Search, Recommendation, and Pailitao (search by image). You can use the high-performance vector index integrated into the FalconSeek kernel to build efficient AI applications, such as applications for search by image and semantic search.
The FalconSeek vector index is fully compatible with open source ES vector engine features, such as k-Nearest Neighbors (k-NN) search. To use FalconSeek in an Alibaba Cloud ES index, you can set index_options.type to havenask_native in the index configuration. For more information about advanced parameter settings, see the detailed descriptions that follow.
Usage examples
Basic example
The following example shows how to create an index to store and retrieve vector data. The index is named my_falcon_seek_index. It contains a vector field named product_vector with 128 dimensions and uses the HNSW algorithm.
Create an index.
PUT /my_falcon_seek_index
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "properties": {
      "product_vector": {
        "type": "dense_vector",
        "dims": 128,
        "index": true,
        "similarity": "l2_norm",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "HNSW",
          "m": 32,
          "ef_construction": 400
        }
      },
      "category": {
        "type": "keyword"
      }
    }
  }
}
Core parameters
The dense_vector field type is used to store dense vector data. When you define a field of this type in mappings, you must specify the following core properties:
dims: The dimension of the vector.
similarity: The function to calculate the similarity between vectors.
index_options: The detailed configuration for the vector index, including the algorithm type and related parameters.
The similarity function is used to measure the degree of similarity between two vectors. Choosing the right function is critical for recall performance.
Function | Description | Scenarios |
l2_norm | Euclidean distance. This function calculates the straight-line distance between two vectors in a multi-dimensional space. A smaller distance indicates greater similarity. | General scenarios, such as image recognition and facial recognition. |
cosine | Cosine similarity. This function calculates the cosine of the angle between two vector directions. A value closer to 1 indicates greater similarity. | Text semantic similarity analysis. This function is not affected by vector length. |
dot_product | Inner product. This function calculates the dot product of two vectors. A larger value indicates greater similarity. | Suitable for scenarios that need to consider vector magnitude, such as recommendation systems. |
max_inner_product | Same as dot_product, but does not require vector normalization. | Suitable for scenarios that need to consider vector magnitude, such as recommendation systems. |
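To make the difference between these functions concrete, the following is a minimal Python sketch of the raw mathematical definitions of l2_norm, cosine, and dot_product. It is only an illustration: the function names are chosen here for readability, and the score that the engine actually returns may be a transformed version of these values.

```python
import math

def l2_norm(a, b):
    """Euclidean distance between a and b; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dot_product(a, b):
    """Inner product of a and b; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine of the angle between a and b; closer to 1 means more similar."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)

a = [1.0, 0.0]
b = [2.0, 0.0]  # same direction as a, twice the length
c = [0.0, 1.0]  # orthogonal to a

# a and b point the same way: cosine ignores the length difference,
# while l2_norm and dot_product both reflect it.
print(l2_norm(a, b), cosine(a, b), dot_product(a, b))
print(l2_norm(a, c), cosine(a, c), dot_product(a, c))
```

The pair (a, b) shows why cosine is described as unaffected by vector length, whereas dot_product grows with magnitude.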
Write data. You can write documents that contain vectors and metadata, such as product categories, to the index. The length of the product_vector array must be the same as the dimension defined by dims (128).
POST /my_falcon_seek_index/_doc/1
{
  "product_vector": [0.12, -0.05, 0.08, 0.24, -0.17, 0.31, 0.02, -0.19, 0.11, 0.28, -0.03, 0.15, 0.22, -0.11, 0.09, 0.33, -0.07, 0.14, 0.26, -0.21, 0.18, 0.29, -0.13, 0.06, 0.35, -0.08, 0.16, 0.23, -0.15, 0.12, 0.27, -0.22, 0.19, 0.32, -0.14, 0.07, 0.25, -0.18, 0.13, 0.30, -0.09, 0.17, 0.24, -0.16, 0.10, 0.34, -0.10, 0.20, 0.31, -0.23, 0.15, 0.28, -0.12, 0.11, 0.26, -0.19, 0.14, 0.29, -0.17, 0.08, 0.22, -0.20, 0.16, 0.27, -0.15, 0.09, 0.25, -0.21, 0.18, 0.30, -0.13, 0.07, 0.24, -0.22, 0.19, 0.32, -0.16, 0.10, 0.26, -0.18, 0.12, 0.28, -0.14, 0.06, 0.23, -0.19, 0.15, 0.29, -0.11, 0.05, 0.21, -0.17, 0.13, 0.27, -0.10, 0.04, 0.20, -0.15, 0.11, 0.25, -0.09, 0.03, 0.19, -0.13, 0.10, 0.24, -0.08, 0.02, 0.18, -0.12, 0.09, 0.23, -0.07, 0.01, 0.17, -0.11, 0.08, 0.22, 0.05, -0.06, 0.09, 0.23, -0.07, 0.01, 0.17, -0.11, 0.08, 0.22],
  "category": "clothes"
}
POST /my_falcon_seek_index/_doc/2
{
  "product_vector": [0.12, -0.05, 0.08, 0.24, -0.17, 0.31, 0.02, -0.19, 0.11, 0.28, -0.03, 0.15, 0.22, -0.11, 0.09, 0.33, -0.07, 0.14, 0.26, -0.21, 0.18, 0.29, -0.13, 0.06, 0.35, -0.08, 0.16, 0.23, -0.15, 0.12, 0.27, -0.22, 0.19, 0.32, -0.14, 0.07, 0.25, -0.18, 0.13, 0.30, -0.09, 0.17, 0.24, -0.16, 0.10, 0.34, -0.10, 0.20, 0.31, -0.23, 0.15, 0.28, -0.12, 0.11, 0.26, -0.19, 0.14, 0.29, -0.17, 0.08, 0.22, -0.20, 0.16, 0.27, -0.15, 0.09, 0.25, -0.21, 0.18, 0.30, -0.13, 0.07, 0.24, -0.22, 0.19, 0.32, -0.16, 0.10, 0.26, -0.18, 0.12, 0.28, -0.14, 0.06, 0.23, -0.19, 0.15, 0.29, -0.11, 0.05, 0.21, -0.17, 0.13, 0.27, -0.10, 0.04, 0.20, -0.15, 0.11, 0.25, -0.09, 0.03, 0.19, -0.13, 0.10, 0.24, -0.08, 0.02, 0.18, -0.12, 0.09, 0.23, -0.07, 0.01, 0.17, -0.11, 0.08, 0.22, 0.05, -0.06, 0.09, 0.23, -0.07, 0.01, 0.17, -0.11, 0.08, 0.22],
  "category": "clothes"
}
Vector retrieval (k-NN). You can find the 5 most similar documents based on a given query vector.
GET /my_falcon_seek_index/_search
{
  "knn": {
    "field": "product_vector",
    "query_vector": [0.12, -0.05, 0.01, 0.24, -0.17, 0.31, 0.02, -0.19, 0.11, 0.28, -0.03, 0.15, 0.22, -0.11, 0.09, 0.23, -0.07, 0.14, 0.26, -0.21, 0.18, 0.29, -0.13, 0.06, 0.35, -0.18, 0.16, 0.23, -0.15, 0.12, 0.27, -0.22, 0.19, 0.32, -0.14, 0.87, 0.25, -0.18, 0.13, 0.30, -0.09, 0.17, 0.24, -0.16, 0.10, 0.64, -0.10, 0.20, 0.31, -0.23, 0.15, 0.28, -0.12, 0.11, 0.26, -0.19, 0.14, 0.29, -0.17, 0.08, 0.22, -0.20, 0.16, 0.27, -0.15, 0.09, 0.25, -0.21, 0.18, 0.30, -0.13, 0.07, 0.24, -0.22, 0.19, 0.52, -0.16, 0.10, 0.26, -0.18, 0.12, 0.28, -0.14, 0.06, 0.23, -0.19, 0.14, 0.29, -0.11, 0.05, 0.21, -0.17, 0.13, 0.27, -0.10, 0.04, 0.20, -0.15, 0.11, 0.25, -0.09, 0.03, 0.19, -0.13, 0.10, 0.24, -0.08, 0.02, 0.18, -0.12, 0.09, 0.23, -0.07, 0.01, 0.17, -0.11, 0.08, 0.22, 0.05, -0.06, 0.09, 0.23, -0.07, 0.01, 0.17, -0.11, 0.08, 0.12],
    "k": 5,
    "num_candidates": 100
  }
}
Core parameters
The knn query is used to perform a k-NN search.
field: The name of the dense_vector field to query.
query_vector: The vector used for the query. Its dimension must match the field definition.
k: The number of most similar results to return.
num_candidates: The size of the candidate set that the algorithm searches internally on each shard. This value must be greater than k and is usually a multiple of k. A larger value increases the recall rate but also increases query latency.
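The constraints on the knn parameters can be checked on the client before a request is sent. The following Python sketch builds a knn request body and enforces the two rules stated above: the query vector must have the same dimension as the field, and num_candidates must be greater than k. The helper name build_knn_body and its dims argument are assumptions made for this illustration and are not part of FalconSeek or Elasticsearch.

```python
import json

def build_knn_body(field, query_vector, k, num_candidates, dims):
    """Return a knn search body after checking the documented constraints.

    dims is the dimension declared in the field mapping; it is passed in
    by the caller because this sketch does not read the mapping itself.
    """
    if len(query_vector) != dims:
        raise ValueError(
            f"query_vector has {len(query_vector)} values, expected {dims}"
        )
    if num_candidates <= k:
        raise ValueError("num_candidates must be greater than k")
    return {
        "knn": {
            "field": field,
            "query_vector": list(query_vector),
            "k": k,
            "num_candidates": num_candidates,
        }
    }

# Placeholder vector of the right length; real queries use an embedding.
body = build_knn_body("product_vector", [0.0] * 128, k=5, num_candidates=100, dims=128)
print(json.dumps(body)[:60] + "...")
```

The resulting dictionary can be serialized with json.dumps and sent as the body of a _search request by any HTTP client.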
Filtered retrieval. You can filter the results during vector retrieval to meet more complex business requirements. The following example finds the 5 most similar documents in the "shoes" category.
GET /my_falcon_seek_index/_search
{
  "knn": {
    "field": "product_vector",
    "query_vector": [0.12, -0.05, 0.01, 0.24, -0.17, 0.31, 0.02, -0.19, 0.11, 0.28, -0.03, 0.15, 0.22, -0.11, 0.09, 0.23, -0.07, 0.14, 0.26, -0.21, 0.18, 0.29, -0.13, 0.06, 0.35, -0.18, 0.16, 0.23, -0.15, 0.12, 0.27, -0.22, 0.19, 0.32, -0.14, 0.87, 0.25, -0.18, 0.13, 0.30, -0.09, 0.17, 0.24, -0.16, 0.10, 0.64, -0.10, 0.20, 0.31, -0.23, 0.15, 0.28, -0.12, 0.11, 0.26, -0.19, 0.14, 0.29, -0.17, 0.08, 0.22, -0.20, 0.16, 0.27, -0.15, 0.09, 0.25, -0.21, 0.18, 0.30, -0.13, 0.07, 0.24, -0.22, 0.19, 0.52, -0.16, 0.10, 0.26, -0.18, 0.12, 0.28, -0.14, 0.06, 0.23, -0.19, 0.14, 0.29, -0.11, 0.05, 0.21, -0.17, 0.13, 0.27, -0.10, 0.04, 0.20, -0.15, 0.11, 0.25, -0.09, 0.03, 0.19, -0.13, 0.10, 0.24, -0.08, 0.02, 0.18, -0.12, 0.09, 0.23, -0.07, 0.01, 0.17, -0.11, 0.08, 0.22, 0.05, -0.06, 0.09, 0.23, -0.07, 0.01, 0.17, -0.11, 0.08, 0.12],
    "k": 5,
    "num_candidates": 100,
    "filter": {
      "term": {
        "category": "shoes"
      }
    }
  }
}
Extended feature: tags_filter filtered retrieval
The tags_filter parameter provides a pre-filtering mechanism that is optimized for the HNSW and QGraph algorithms. It offers higher performance than the standard bool filter because it excludes non-matching nodes early in the graph traversal process.
Scenarios: You can use this feature when the filtering field has a limited number of unique values, such as a category or brand ID, and you require high query performance.
How to enable this feature:
In the index_options of the mappings, you can declare the keyword type field to be used for filtering with the tags parameter.
PUT /my_vector_index_with_tags
{
  "mappings": {
    "properties": {
      "product_vector": {
        "type": "dense_vector",
        "dims": 128,
        "index_options": {
          "type": "havenask_native",
          "knn_type": "HNSW",
          "tags": ["category"] // Declare the category field for tags_filter
        }
      },
      "category": {
        "type": "keyword" // Must be of keyword type
      }
    }
  }
}
In the knn query, you can use the tags_filter parameter with the "field_name = value" syntax. This parameter supports | (OR) and & (AND) logic.
GET /my_vector_index_with_tags/_search
{
  "knn": {
    "field": "product_vector",
    "query_vector": [...],
    "k": 5,
    "tags_filter": "category = shoes | category = socks"
  }
}
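When the allowed values come from application code, the tags_filter string can be assembled programmatically. The following Python sketch produces expressions in the "field_name = value" syntax joined with | or &. The helper names are hypothetical, and the sketch deliberately does not mix | and & in one expression, because operator precedence and grouping are not described in this document.

```python
def tags_filter_any(field, values):
    """Match documents whose field equals any of the values (OR logic)."""
    if not values:
        raise ValueError("at least one value is required")
    return " | ".join(f"{field} = {value}" for value in values)

def tags_filter_all(clauses):
    """Require every 'field = value' clause to match (AND logic)."""
    if not clauses:
        raise ValueError("at least one clause is required")
    return " & ".join(clauses)

# Same expression as in the query example above.
print(tags_filter_any("category", ["shoes", "socks"]))

# Two different tag fields combined with AND; region is an illustrative
# field that would also have to be declared in tags.
print(tags_filter_all(["category = shoes", "region = eu"]))
```

Every field used in the expression must be a keyword field declared in the tags parameter of the mapping.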
Algorithm selection and performance guide
Choosing the right knn_type algorithm is critical for achieving the best balance of performance, cost, and precision.
Algorithm comparison and recommendations
Algorithm | Recall rate | Query speed | Memory usage | Scenarios and recommendations | Core limitations |
HNSW | High | Fast | Medium | General-purpose first choice. Suitable for most scenarios, providing the best balance between recall rate, performance, and resource consumption. It performs well with datasets ranging from 100,000 to 10 million documents. | No special limitations. |
RabitQGraph | High | Fastest | Lowest | For extreme performance and cost-sensitive scenarios. Suitable for datasets that require millisecond-level query latency or are extremely sensitive to memory costs, especially for very large datasets (over 10 million documents). | Only supports |
QGraph | High | Fast | Low | For large-scale, storage cost-sensitive scenarios. Significantly reduces memory and storage usage through vector quantization. Suitable for datasets with over 5 million documents. | No special limitations. |
QC | Medium | Medium | Low | For large-scale, memory-constrained scenarios. Consider this when build time is not sensitive, but runtime memory resources are very tight. Suitable for datasets with over 1 million documents. | Longer build time. |
Linear | Highest (100%) | Slow | Low | For small datasets or scenarios requiring absolute precision. Suitable for scenarios where the total number of documents is less than | Only suitable for small datasets. |
Scenario-based selection path
If you are just starting or are unsure which algorithm to use, choose HNSW.
If the data volume grows and performance degrades:
  If you have sufficient memory and want to achieve higher Queries Per Second (QPS), you can migrate from HNSW to RabitQGraph if the requirements for RabitQGraph are met.
  If memory or storage costs are a concern, you can migrate from HNSW to QGraph and configure a quantizer.
For very large datasets with over 10 million documents:
  If you have extremely strict latency requirements, choose RabitQGraph.
  If you are cost-sensitive and can tolerate slightly higher latency, choose QGraph.
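The selection path above can be written down as a small decision function. The following Python sketch is one possible reading of that path, not an official rule: the function name, the flag names, and the choice to fall back to QGraph for datasets over 10 million documents when no preference is given are all assumptions made for illustration.

```python
LARGE_DATASET = 10_000_000  # "over 10 million documents" in the text above

def choose_knn_type(doc_count, want_lowest_latency=False, cost_sensitive=False,
                    rabitq_requirements_met=False):
    """Suggest a knn_type value following the scenario-based path."""
    # RabitQGraph is only an option when its own requirements are met.
    if want_lowest_latency and rabitq_requirements_met:
        return "RabitQGraph"
    # Cost pressure, or a very large dataset, points to the quantized graph.
    if cost_sensitive or doc_count > LARGE_DATASET:
        return "QGraph"
    # Default starting point.
    return "HNSW"

print(choose_knn_type(500_000))
print(choose_knn_type(5_000_000, cost_sensitive=True))
print(choose_knn_type(50_000_000, want_lowest_latency=True,
                      rabitq_requirements_met=True))
```

A function like this is mainly useful as a checklist in provisioning scripts; the final choice should still be validated against measured recall and latency.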
Index management (Mappings)
When you create an index, all vector-related configurations are set within the index_options block of the mappings.
PUT /<your_index_name>
{
"mappings": {
"properties": {
"<your_vector_field>": {
"type": "dense_vector",
"dims": 768,
"similarity": "l2_norm",
"index_options": {
// --- General parameters ---
"type": "havenask_native",
"knn_type": "HNSW",
// --- Algorithm build parameters ---
"m": 32,
"ef_construction": 400,
// --- Advanced parameters ---
"thread_count": 8
}
}
}
}
}
General parameters
Parameter | Description | Type | Required | Default |
type | Specifies the FalconSeek vector index engine. Must be set to havenask_native. | String | Yes | None |
knn_type | Specifies the vector index algorithm. Options are HNSW, RabitQGraph, QGraph, QC, and Linear. | String | No | |
Algorithm build parameters
The following parameters take effect during index construction and determine the structure and quality of the index.
Common parameters for HNSW and QGraph
Parameter | Description | Type | Default | Recommended values and impact |
m | The maximum number of neighbors for each node in the graph. | Integer | | Impact: Directly affects recall rate and memory usage. |
ef_construction | The width of the candidate set search during graph construction. | Integer | | Impact: Determines the quality and time of index construction. |
QGraph-specific parameters
Parameter | Description | Type | Default | Recommended values and impact |
quantizer | The vector quantization method, used to compress vectors to reduce memory and storage usage. | String | None | Impact: Significantly reduces resource usage, but may cause a slight loss of precision. |
Advanced parameters
Parameter | Description | Type | Default | Recommended values and impact |
thread_count | The number of parallel threads used during index construction. | Integer | | Impact: Speeds up the build process. |
tags | Declares a list of fields for high-performance filtering with tags_filter. | Array | | Impact: Enables high-performance pre-filtering. |
linear_build_threshold | When the total number of documents in the index is below this threshold, the system automatically switches to linear search. | Integer | 0 | Impact: Avoids building complex index structures for small datasets. |
index_params | A configuration interface for the underlying engine's parameters, used for advanced tuning. | Object | | Impact: Allows fine-grained control over every detail of the build and query processes. |
The index_params parameter provides direct access to the underlying proxima.* and param.* parameters. For example, the top-level m parameter corresponds to the underlying proxima.hnsw.builder.max_neighbor_count. You should use this configuration only when you need to adjust details that are not exposed by the top-level parameters.
"index_options": {
  "knn_type": "HNSW",
  "m": 32, // This will be overwritten by index_params below
  "index_params": {
    "proxima.hnsw.builder.max_neighbor_count": 48 // The final effective value is 48
  }
}
The knn query body
Parameter | Description | Type | Required |
field | The name of the dense_vector field to query. | String | Yes |
query_vector | The vector used for the query. | Array | Yes |
k | The number of most similar results to return. | Integer | Yes |
num_candidates | The size of the search candidate set on each shard. A larger value increases the recall rate but slows down the query. | Integer | No (Recommended) |
tags_filter | (Optional) High-performance pre-filtering on the fields declared in tags. | String | No |
search_params | (Optional) Dynamically adjust some search parameters at query time for temporary tuning. | Object | No |
Parameter description
HNSW
The parameters for the Hierarchical Navigable Small World (HNSW) algorithm use the proxima.hnsw. namespace and can be set using index_params.
HNSW Builder
Parameter name | Type | Default | Description |
proxima.hnsw.builder.max_neighbor_count | uint32 | 100 | The number of neighbors in the graph. A larger value makes the graph more accurate but increases computation and storage overhead. It generally should not exceed the feature dimension. The maximum is 65535. |
proxima.hnsw.builder.efconstruction | uint32 | 500 | Controls the precision of graph construction. A larger value results in a more accurate graph but takes longer to build. |
proxima.hnsw.builder.thread_count | uint32 | 0 | The number of threads to use during construction. If set to 0, it uses the number of CPU cores. |
proxima.hnsw.builder.memory_quota | uint64 | 0 | Limits the maximum memory for construction. Disk-based construction is not currently supported. If this value is exceeded, the build will fail. |
proxima.hnsw.builder.scaling_factor | uint32 | 50 | The ratio of nodes between graph layers. Generally does not need to be modified. The value range is [5,1000]. |
proxima.hnsw.builder.neighbor_prune_ratio | float | 0.5 | Controls the number of neighbors at which edge pruning begins in the neighbor table. Generally does not need to be modified. |
proxima.hnsw.builder.upper_neighbor_ratio | float | 0.5 | The ratio of neighbors in the upper layer of the graph relative to the layer 0 graph. Generally does not need to be modified. |
proxima.hnsw.builder.enable_adsampling | bool | false | Disabled by default. Currently only supports Euclidean distance calculation for fp32 datasets. Not recommended for dimensions below 256. |
proxima.hnsw.builder.slack_pruning_factor | float | 1.0 | Default is 1.0. Recommended to be between [1.1, 1.2]. For gist960 and sift128, 1.1 is recommended. |
HNSW Searcher
Parameter name | Type | Default | Description |
proxima.hnsw.searcher.ef | uint32 | 500 | Used to control retrieval precision. A larger value scans more documents and increases the recall rate. |
proxima.hnsw.searcher.max_scan_ratio | float | 0.1 | Controls the maximum proportion of documents to scan during retrieval. If the ef value converges early, this ratio may not be reached. |
proxima.hnsw.searcher.neighbors_in_memory_enable | bool | false | If enabled, keeps the neighbor table in memory, which improves performance but consumes more memory. |
proxima.hnsw.searcher.check_crc_enable | bool | false | Specifies whether to perform a cyclic redundancy check (CRC) on the index. Enabling this will increase load time. |
proxima.hnsw.searcher.visit_bloomfilter_enable | bool | false | Uses a bloom filter as the container for deduplicating visited graph nodes. This optimizes memory but slightly degrades performance. |
proxima.hnsw.searcher.visit_bloomfilter_negative_prob | float | 0.001 | The accuracy of the bloom filter. A smaller value is more accurate but uses more memory. |
proxima.hnsw.searcher.brute_force_threshold | int | 1000 | If the total number of documents is less than this value, a linear search is performed. |
HNSW configuration examples
// Basic HNSW configuration
PUT /hnsw_basic
{
"mappings": {
"properties": {
"vector": {
"type": "dense_vector",
"dims": 768,
"index": true,
"similarity": "cosine",
"index_options": {
"type": "havenask_native",
"knn_type": "HNSW"
}
}
}
}
}
// High-performance HNSW configuration
PUT /hnsw_performance
{
"mappings": {
"properties": {
"vector": {
"type": "dense_vector",
"dims": 1024,
"index": true,
"similarity": "dot_product",
"index_options": {
"type": "havenask_native",
"knn_type": "HNSW",
"m": 48,
"ef_construction": 500,
"thread_count": 8,
"linear_build_threshold": 1000,
"is_embedding_saved": true,
"embedding_load_strategy": "ANN_INDEX_FILE",
"index_load_strategy": "MEM"
}
}
}
}
}
RabitQGraph
RabitQGraph Builder
Parameter name | Type | Default | Description |
param.rabitQGraph.builder.neighbor_cnt | uint32 | 128 | The number of neighbors for each node, affecting graph connectivity and search precision. |
param.rabitQGraph.builder.ef_construction | uint32 | 512 | The EF parameter for building, which controls the number of candidate nodes during construction. |
param.rabitQGraph.builder.prune_ratio | float | 0.5 | Neighbor pruning ratio, used to optimize the graph structure. |
param.rabitQGraph.builder.cluster_count | uint32 | 64 | The number of cluster centroids, used for vector quantization. |
param.rabitQGraph.builder.quantized_bit_count | uint32 | 1 | The number of bits for quantization. Can only be set to 1, 4, 5, 8, or 9. |
param.rabitQGraph.builder.slack_prune_factor | float | 1.0 | Slack pruning factor, used to control the pruning policy. |
param.rabitQGraph.builder.repair_connectivity | bool | true | Specifies whether to repair graph connectivity. |
param.rabitQGraph.builder.thread_count | uint32 | 0 | The number of threads used during construction. If set to 0, it uses the number of CPU cores. |
param.rabitQGraph.builder.ckpt_count | uint32 | 0 | The number of checkpoints, used for incremental builds. |
param.rabitQGraph.builder.ckpt_threshold | uint32 | 2000000 | Checkpoint threshold. |
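Because quantized_bit_count accepts only a fixed set of values, a configuration can be checked before the index is created. The following Python sketch validates that one setting in an index_params dictionary. The helper name is hypothetical, and the sketch assumes that an omitted key falls back to the default of 1 listed in the table above.

```python
ALLOWED_BIT_COUNTS = {1, 4, 5, 8, 9}
BIT_COUNT_KEY = "param.rabitQGraph.builder.quantized_bit_count"

def check_quantized_bit_count(index_params):
    """Return the effective bit count, or raise if the value is not allowed."""
    bits = index_params.get(BIT_COUNT_KEY, 1)  # 1 is the listed default
    if bits not in ALLOWED_BIT_COUNTS:
        allowed = ", ".join(str(b) for b in sorted(ALLOWED_BIT_COUNTS))
        raise ValueError(f"{BIT_COUNT_KEY} must be one of {allowed}, got {bits!r}")
    return bits

params = {
    "param.rabitQGraph.builder.neighbor_cnt": 128,
    "param.rabitQGraph.builder.quantized_bit_count": 4,
}
print(check_quantized_bit_count(params))
```

Running such a check in deployment tooling surfaces an invalid value such as 2 or 16 before any index build is attempted.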
RabitQGraph Searcher
Parameter name | Type | Default | Description |
param.rabitQGraph.searcher.ef | uint32 | 250 | The EF parameter for searching, which affects search precision and performance. |
param.rabitQGraph.searcher.max_scan_ratio | double | 0.05 | Maximum scan ratio, which limits the percentage of nodes to search. |
param.rabitQGraph.searcher.check_crc_enable | bool | false | Specifies whether to enable CRC check. |
param.rabitQGraph.searcher.thread_count | uint32 | 1 | The number of threads to use for searching. |
param.rabitQGraph.searcher.thread_safe_filter | bool | false | Specifies whether to enable thread-safe filtering. |
RabitQGraph configuration examples
// High-performance configuration
{
"index_params": {
"param.rabitQGraph.builder.neighbor_cnt": 256,
"param.rabitQGraph.builder.ef_construction": 512,
"param.rabitQGraph.builder.quantized_bit_count": 8,
"param.rabitQGraph.builder.cluster_count": 128,
"param.rabitQGraph.builder.thread_count": 8,
"param.rabitQGraph.searcher.ef": 300,
"param.rabitQGraph.searcher.max_scan_ratio": 0.1
}
}
// Memory-optimized configuration
{
"index_params": {
"param.rabitQGraph.builder.neighbor_cnt": 64,
"param.rabitQGraph.builder.ef_construction": 200,
"param.rabitQGraph.builder.quantized_bit_count": 1,
"param.rabitQGraph.builder.cluster_count": 32,
"param.rabitQGraph.searcher.ef": 150,
"param.rabitQGraph.searcher.max_scan_ratio": 0.03
}
}
// Balanced configuration
{
"index_params": {
"param.rabitQGraph.builder.neighbor_cnt": 128,
"param.rabitQGraph.builder.ef_construction": 400,
"param.rabitQGraph.builder.quantized_bit_count": 4,
"param.rabitQGraph.builder.cluster_count": 64,
"param.rabitQGraph.searcher.ef": 250
}
}
Linear
The Linear algorithm performs a linear brute-force search. Its parameters are relatively simple and use the proxima.linear. namespace.
Linear Builder/Searcher
Parameter name | Type | Default | Description |
proxima.linear.builder.column_major_order | string | false | Specifies whether to use row-major (false) or column-major (true) order for features during construction. |
proxima.linear.searcher.read_block_size | uint32 | 1048576 | The size of the block to read into memory at one time during the search phase. The recommended value is 1 MB. |
Linear configuration examples
// Basic configuration
{
"index_params": {
"proxima.linear.builder.column_major_order": "false",
"proxima.linear.searcher.read_block_size": 1048576
}
}
// Large memory configuration
{
"index_params": {
"proxima.linear.builder.column_major_order": "true",
"proxima.linear.searcher.read_block_size": 2097152
}
}
QC
The QC (Quantization Clustering) algorithm uses a quantization clustering index. Its parameters use the proxima.qc. namespace.
QC Builder
Parameter name | Type | Default | Description |
proxima.qc.builder.train_sample_count | uint32 | 0 | Specifies the amount of training data. If 0, all data is used. |
proxima.qc.builder.thread_count | uint32 | 0 | The number of threads to use during construction. If set to 0, it uses the number of CPU cores. |
proxima.qc.builder.centroid_count | string | - | Cluster centroid parameter. Supports hierarchical clustering. Separate layers with "*". |
proxima.qc.builder.cluster_class | string | OptKmeansCluster | Specifies the clustering method. |
proxima.qc.builder.cluster_auto_tuning | bool | false | Specifies whether to enable automatic tuning of the number of centroids. |
proxima.qc.builder.optimizer_class | string | HcBuilder | The optimizer for the centroid part, used to improve precision during classification. |
proxima.qc.builder.optimizer_params | IndexParams | - | The build and retrieval parameters corresponding to the optimize method. |
proxima.qc.builder.converter_class | string | - | If Measure is InnerProduct, a MIPS (maximum inner product search) transformation is automatically performed. |
proxima.qc.builder.converter_params | IndexParams | - | Initialization parameters for converter_class. |
proxima.qc.builder.quantizer_class | string | - | Configures the quantizer. Options include Int8QuantizerConverter, Int4QuantizerConverter, etc. |
proxima.qc.builder.quantizer_params | IndexParams | - | Parameters related to the quantizer. |
proxima.qc.builder.quantize_by_centroid | bool | false | When using quantizer_class, specifies whether to quantize by centroid. |
proxima.qc.builder.store_original_features | bool | false | Specifies whether to store the original features. |
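The centroid_count value is a string so that it can describe several clustering layers separated by "*". The following Python sketch parses such a string into per-layer counts. The helper name is hypothetical, and treating the product of the layers as the number of leaf-level clusters is an assumption about hierarchical clustering in general, not something this document states.

```python
import math

def parse_centroid_count(spec):
    """Split a centroid_count string such as '100*100' into layer sizes."""
    layers = [int(part) for part in spec.split("*")]
    if any(count <= 0 for count in layers):
        raise ValueError(f"centroid counts must be positive: {spec!r}")
    return layers

for spec in ("1000", "100*100"):
    layers = parse_centroid_count(spec)
    # Assumption: each first-layer centroid is subdivided by the next layer,
    # so the leaf-level cluster count is the product of the layer sizes.
    print(spec, "->", layers, "leaf clusters:", math.prod(layers))
```

Under that assumption, "100*100" yields far more leaf clusters than "1000" while each layer only has to train 100 centroids.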
QC Searcher
Parameter name | Type | Default | Description |
proxima.qc.searcher.scan_ratio | float | 0.01 | Used to calculate max_scan_num: total doc count * scan_ratio. |
proxima.qc.searcher.optimizer_params | IndexParams | - | Specifies the online retrieval parameters corresponding to the optimizer used during the build. |
proxima.qc.searcher.brute_force_threshold | int | 1000 | If the total number of documents is less than this value, a linear search is performed. |
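The interaction between scan_ratio and brute_force_threshold is simple arithmetic, sketched below in Python. The function name and the returned dictionary are illustrative only, and truncating the product to an integer is an assumption, since the rounding rule is not specified here.

```python
def qc_scan_plan(total_docs, scan_ratio=0.01, brute_force_threshold=1000):
    """Estimate how many documents a QC search may scan.

    Defaults mirror the QC Searcher table: scan_ratio 0.01 and
    brute_force_threshold 1000.
    """
    if total_docs < brute_force_threshold:
        # Small index: a linear search looks at every document.
        return {"mode": "linear", "max_scan_num": total_docs}
    # max_scan_num = total doc count * scan_ratio (truncated, by assumption)
    return {"mode": "qc", "max_scan_num": int(total_docs * scan_ratio)}

print(qc_scan_plan(500))
print(qc_scan_plan(1_000_000))
print(qc_scan_plan(1_000_000, scan_ratio=0.05))
```

With the defaults, an index of one million documents scans at most 10,000 documents per query, and raising scan_ratio to 0.05 raises that limit to 50,000.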
QC configuration examples
// Basic configuration
{
"index_params": {
"proxima.qc.builder.thread_count": 4,
"proxima.qc.builder.centroid_count": "1000",
"proxima.qc.builder.cluster_class": "OptKmeansCluster",
"proxima.qc.searcher.scan_ratio": 0.02
}
}
// Hierarchical clustering configuration
{
"index_params": {
"proxima.qc.builder.thread_count": 8,
"proxima.qc.builder.centroid_count": "100*100",
"proxima.qc.builder.optimizer_class": "HnswBuilder",
"proxima.qc.builder.quantizer_class": "Int8QuantizerConverter",
"proxima.qc.searcher.scan_ratio": 0.01
}
}
// High-precision configuration
{
"index_params": {
"proxima.qc.builder.thread_count": 12,
"proxima.qc.builder.centroid_count": "2000",
"proxima.qc.builder.train_sample_count": 100000,
"proxima.qc.builder.store_original_features": true,
"proxima.qc.searcher.scan_ratio": 0.05
}
}
QGraph
The QGraph (Quantized Graph) algorithm uses a quantized graph index. It inherits most of the parameters from HNSW and adds quantization-related parameters.
QGraph Builder
QGraph inherits all HNSW Builder parameters and adds the following:
Parameter name | Type | Default | Description |
proxima.qgraph.builder.quantizer_class | string | - | Configures the quantizer. Options include Int8QuantizerConverter, Int4QuantizerConverter, HalfFloatConverter, and DoubleBitConverter. |
proxima.qgraph.builder.quantizer_params | IndexParams | - | Configures parameters related to the quantizer. |
All proxima.hnsw.builder.* parameters also apply to QGraph.
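To get a feel for what a quantizer saves, the raw vector payload can be estimated from the element width alone. The following Python sketch is a back-of-the-envelope calculation under stated assumptions: 4 bytes per dimension for unquantized 32-bit floats (labeled "fp32" here, a name chosen for this sketch), 2 bytes for HalfFloatConverter, 1 byte for Int8QuantizerConverter, and half a byte for Int4QuantizerConverter. It ignores the graph's neighbor lists, any quantizer metadata, and whether original vectors are stored as well, so actual usage will be higher.

```python
# Bytes per vector dimension implied by each element type (assumption:
# the quantizer stores exactly one element of that width per dimension).
BYTES_PER_DIM = {
    "fp32": 4.0,  # unquantized baseline, label used only in this sketch
    "HalfFloatConverter": 2.0,
    "Int8QuantizerConverter": 1.0,
    "Int4QuantizerConverter": 0.5,
}

def raw_vector_bytes(doc_count, dims, quantizer="fp32"):
    """Estimate the bytes needed for the vector values alone."""
    return int(doc_count * dims * BYTES_PER_DIM[quantizer])

for name in BYTES_PER_DIM:
    size_gb = raw_vector_bytes(10_000_000, 768, name) / 1e9
    print(f"{name}: {size_gb:.2f} GB for 10M vectors of 768 dims")
```

The ratios between the rows are the useful part: Int8 cuts the vector payload to a quarter of the unquantized size and Int4 to an eighth, at the cost of some precision.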
QGraph Searcher
QGraph inherits all HNSW Searcher parameters.
QGraph configuration examples
// Int8 quantization configuration
{
"index_params": {
"proxima.hnsw.builder.max_neighbor_count": 32,
"proxima.hnsw.builder.efconstruction": 400,
"proxima.hnsw.builder.thread_count": 4,
"proxima.qgraph.builder.quantizer_class": "Int8QuantizerConverter",
"proxima.qgraph.builder.quantizer_params": {},
"proxima.hnsw.searcher.ef": 300
}
}
// Int4 quantization configuration (more memory-efficient)
{
"index_params": {
"proxima.hnsw.builder.max_neighbor_count": 48,
"proxima.hnsw.builder.efconstruction": 500,
"proxima.hnsw.builder.thread_count": 6,
"proxima.qgraph.builder.quantizer_class": "Int4QuantizerConverter",
"proxima.qgraph.builder.quantizer_params": {},
"proxima.hnsw.searcher.ef": 400,
"proxima.hnsw.searcher.max_scan_ratio": 0.1
}
}
// HalfFloat quantization configuration (high precision)
{
"index_params": {
"proxima.hnsw.builder.max_neighbor_count": 64,
"proxima.hnsw.builder.efconstruction": 600,
"proxima.hnsw.builder.thread_count": 8,
"proxima.qgraph.builder.quantizer_class": "HalfFloatConverter",
"proxima.qgraph.builder.quantizer_params": {},
"proxima.hnsw.searcher.ef": 500
}
}
Dynamic parameter adjustment with search_params
You can temporarily adjust certain search parameters without rebuilding the index to balance the recall rate and query latency in different scenarios.
For example, for normal online queries, you can use the default or a lower ef value to ensure low latency. For batch analytics or tasks that require high precision, you can temporarily increase the ef value using search_params to obtain a higher recall rate.
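One way to apply this pattern in client code is to keep a base knn clause and attach search_params only for the requests that need higher recall. The following Python sketch does that. The helper name is hypothetical, and converting every value to a string simply mirrors the examples in this section, which pass search_params values as strings.

```python
def with_search_params(knn, overrides):
    """Return a copy of a knn clause with query-time search_params attached."""
    tuned = dict(knn)  # shallow copy so the base clause stays unchanged
    tuned["search_params"] = {key: str(value) for key, value in overrides.items()}
    return tuned

base_knn = {
    "field": "vector",
    "query_vector": [0.1, 0.2, 0.3],
    "k": 10,
    "num_candidates": 100,
}

# Online traffic uses base_knn as is; a batch job raises ef for recall.
batch_knn = with_search_params(base_knn, {"proxima.hnsw.searcher.ef": 400})
print({"knn": batch_knn})
```

Because the base clause is copied rather than modified, the same object can be reused for low-latency queries without carrying the batch settings along.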
HNSW
{
"search_params": {
"proxima.hnsw.searcher.ef": "400",
"proxima.hnsw.searcher.max_scan_ratio": "0.15"
}
}
QGraph
{
"search_params": {
"proxima.hnsw.searcher.ef": "400",
"proxima.hnsw.searcher.max_scan_ratio": "0.15"
}
}
QC
{
"search_params": {
"proxima.qc.searcher.scan_ratio": "0.02"
}
}
RabitQGraph
{
"search_params": {
"param.rabitQGraph.searcher.ef": "300",
"param.rabitQGraph.searcher.max_scan_ratio": "0.08"
}
}
search_params configuration examples
// Dynamically adjust precision and performance for HNSW
GET vector_index/_search
{
"knn": {
"field": "vector",
"query_vector": [0.1, 0.2, 0.3],
"k": 10,
"num_candidates": 100,
"search_params": {
"proxima.hnsw.searcher.ef": "500",
"proxima.hnsw.searcher.max_scan_ratio": "0.2"
}
}
}
// Dynamically adjust scan ratio for QC
GET vector_index/_search
{
"knn": {
"field": "vector",
"query_vector": [0.1, 0.2, 0.3],
"k": 10,
"num_candidates": 100,
"search_params": {
"proxima.qc.searcher.scan_ratio": "0.05"
}
}
}
// Dynamically adjust search precision for RabitQGraph
GET vector_index/_search
{
"knn": {
"field": "vector",
"query_vector": [0.1, 0.2, 0.3],
"k": 10,
"num_candidates": 100,
"search_params": {
"param.rabitQGraph.searcher.ef": "400",
"param.rabitQGraph.searcher.max_scan_ratio": "0.1"
}
}
}
linear_build_threshold
This parameter is an optional integer. The default value is 0. If the number of documents is less than this threshold, a linear search is performed instead of building a complex index.
// Disable the linear threshold
{
"linear_build_threshold": 0
}
// Use linear search for small datasets
{
"linear_build_threshold": 1000
}
// A larger threshold, suitable for staging environments
{
"linear_build_threshold": 5000
}
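The rule behind these three settings can be stated in one line of code. The following Python sketch only restates the description above (a linear search is used when the document count is less than the threshold); the function name is hypothetical and the engine's internal check may differ in detail.

```python
def uses_linear_search(doc_count, linear_build_threshold=0):
    """True when the index is small enough to skip building a complex index."""
    return doc_count < linear_build_threshold

print(uses_linear_search(800, 0))      # threshold 0: never linear
print(uses_linear_search(800, 1000))   # below the threshold
print(uses_linear_search(1000, 1000))  # equal to the threshold: not below it
```

With the default of 0 no document count is below the threshold, which is why that value disables the behavior.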
Appendix: Complete examples
HNSW
// Basic HNSW configuration
PUT /hnsw_basic
{
"mappings": {
"properties": {
"vector": {
"type": "dense_vector",
"dims": 768,
"index": true,
"similarity": "cosine",
"index_options": {
"type": "havenask_native",
"knn_type": "HNSW"
}
}
}
}
}
// High-performance HNSW configuration
PUT /hnsw_performance
{
"mappings": {
"properties": {
"vector": {
"type": "dense_vector",
"dims": 1024,
"index": true,
"similarity": "dot_product",
"index_options": {
"type": "havenask_native",
"knn_type": "HNSW",
"m": 48,
"ef_construction": 500,
"thread_count": 8,
"linear_build_threshold": 1000,
"is_embedding_saved": true,
"embedding_load_strategy": "ANN_INDEX_FILE",
"index_load_strategy": "MEM"
}
}
}
}
}
Linear
// Basic Linear configuration
PUT /linear_basic
{
"mappings": {
"properties": {
"vector": {
"type": "dense_vector",
"dims": 384,
"index": true,
"similarity": "l2_norm",
"index_options": {
"type": "havenask_native",
"knn_type": "Linear"
}
}
}
}
}
QC
// Basic QC configuration
PUT /qc_basic
{
"mappings": {
"properties": {
"vector": {
"type": "dense_vector",
"dims": 768,
"index": true,
"similarity": "cosine",
"index_options": {
"type": "havenask_native",
"knn_type": "QC"
}
}
}
}
}
// Custom QC configuration
PUT /qc_custom
{
"mappings": {
"properties": {
"vector": {
"type": "dense_vector",
"dims": 1024,
"index": true,
"similarity": "dot_product",
"index_options": {
"type": "havenask_native",
"knn_type": "QC",
"thread_count": 8,
"linear_build_threshold": 5000,
"index_params": "{\"proxima.qc.builder.thread_count\": 8, \"proxima.qc.builder.centroid_count\": \"2000\"}"
}
}
}
}
}
QGraph
// Basic QGraph configuration
PUT /qgraph_basic
{
"mappings": {
"properties": {
"vector": {
"type": "dense_vector",
"dims": 768,
"index": true,
"similarity": "cosine",
"index_options": {
"type": "havenask_native",
"knn_type": "QGraph",
"quantizer": "int8"
}
}
}
}
}
// High-precision QGraph configuration
PUT /qgraph_high_precision
{
"mappings": {
"properties": {
"vector": {
"type": "dense_vector",
"dims": 1536,
"index": true,
"similarity": "cosine",
"index_options": {
"type": "havenask_native",
"knn_type": "QGraph",
"m": 40,
"ef_construction": 600,
"thread_count": 8,
"quantizer": "fp16",
"is_embedding_saved": true,
"index_load_strategy": "MEM"
}
}
}
}
}
// Memory-optimized QGraph configuration
PUT /qgraph_memory_optimized
{
"mappings": {
"properties": {
"vector": {
"type": "dense_vector",
"dims": 1024,
"index": true,
"similarity": "l2_norm",
"index_options": {
"type": "havenask_native",
"knn_type": "QGraph",
"m": 24,
"ef_construction": 300,
"thread_count": 6,
"quantizer": "int4",
"is_embedding_saved": false,
"index_load_strategy": "BUFFER"
}
}
}
}
}
RabitQGraph
// Basic RabitQGraph configuration (only supports l2_norm similarity)
PUT /rabitqgraph_basic
{
"mappings": {
"properties": {
"vector": {
"type": "dense_vector",
"dims": 64,
"index": true,
"similarity": "l2_norm",
"index_options": {
"type": "havenask_native",
"knn_type": "RabitQGraph"
}
}
}
}
}
// High-performance RabitQGraph configuration - using index_params in Map format
PUT /rabitqgraph_performance
{
"mappings": {
"properties": {
"vector": {
"type": "dense_vector",
"dims": 128,
"index": true,
"similarity": "l2_norm",
"index_options": {
"type": "havenask_native",
"knn_type": "RabitQGraph",
"thread_count": 8,
"index_params": {
"param.rabitQGraph.builder.neighbor_cnt": 256,
"param.rabitQGraph.builder.ef_construction": 512,
"param.rabitQGraph.builder.quantized_bit_count": 4,
"param.rabitQGraph.builder.cluster_count": 128,
"param.rabitQGraph.searcher.ef": 300
}
}
}
}
}
}
// Memory-optimized RabitQGraph configuration
PUT /rabitqgraph_memory_optimized
{
"mappings": {
"properties": {
"vector": {
"type": "dense_vector",
"dims": 192,
"index": true,
"similarity": "l2_norm",
"index_options": {
"type": "havenask_native",
"knn_type": "RabitQGraph",
"thread_count": 4,
"linear_build_threshold": 1000,
"index_params": {
"param.rabitQGraph.builder.neighbor_cnt": 64,
"param.rabitQGraph.builder.ef_construction": 200,
"param.rabitQGraph.builder.quantized_bit_count": 1,
"param.rabitQGraph.builder.cluster_count": 32,
"param.rabitQGraph.searcher.ef": 150,
"param.rabitQGraph.searcher.max_scan_ratio": 0.05
}
}
}
}
}
}
// RabitQGraph configuration with tag filtering
PUT /rabitqgraph_with_tags
{
"mappings": {
"properties": {
"vector": {
"type": "dense_vector",
"dims": 256,
"index": true,
"similarity": "l2_norm",
"index_options": {
"type": "havenask_native",
"knn_type": "RabitQGraph",
"tags": ["category", "region"],
"index_params": {
"param.rabitQGraph.builder.neighbor_cnt": 128,
"param.rabitQGraph.builder.ef_construction": 400,
"param.rabitQGraph.builder.quantized_bit_count": 8,
"param.rabitQGraph.searcher.ef": 250
}
}
},
"category": {
"type": "keyword"
},
"region": {
"type": "keyword"
}
}
}
}