When you configure an index schema, you can configure advanced parameters for the vector index in the Index Settings section. This topic describes these advanced parameters.
Parameter description
Parameter | Valid value | Description |
dimension | An integer that is greater than 1 | The dimension of the vector. |
distance_type | | The distance type of the vector. |
vector_index_type | | The algorithm that is used to build the vector index. |
enable_rt_build | | Specifies whether to enable real-time indexing. A value of true specifies that real-time indexing is enabled. |
rt_index_params | Default value | The parameters for real-time indexing. |
build_index_params | Default value | The parameters for the builder type that is specified by the builder_name parameter. |
search_index_params | Default value | The parameters for the searcher type that is specified by the searcher_name parameter. |
embedding_delimiter | Default value: a comma (,). You can specify a custom delimiter. | The delimiter that separates the elements of a vector. |
major_order | | The data storage mode. |
linear_build_threshold | Default value: 5000. | The document-count threshold below which LinearBuilder is used. If the number of documents is less than this threshold, the system uses LinearBuilder and LinearSearcher, which reduce memory usage and return lossless retrieval results. The performance of LinearBuilder degrades when the number of documents is large. Default value: 10000. |
min_scan_doc_cnt | Default value: 20000. | The minimum number of candidate documents to scan during retrieval. Default value: 10000. This parameter is similar to the proxima.qc.searcher.scan_ratio parameter. If you specify both min_scan_doc_cnt and proxima.qc.searcher.scan_ratio, the larger value is used as the minimum number of candidates. |
enable_recall_report | Default value: true. | Specifies whether to report the recall rate metric. |
is_embedding_saved | Default value: false. | Specifies whether to save the original vector. If you enable INT8 quantization or FP16 quantization and enable real-time retrieval, make sure that you set the is_embedding_saved parameter to true. Otherwise, incremental vectors fail to be built in batches. |
ignore_invalid_doc | Default value: true. | Specifies whether to ignore invalid vector data. If you set this parameter to true, the system discards a vector whose dimension is incorrect or that contains no data. |
indexer | | The plug-in that you want to use to build the vector index. Set this parameter to aitheta2_indexer. |
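
The following Python sketch shows one way to assemble the parameters in the table and to check a few of the rules that the descriptions state. It is an illustration only, not the product's API. The helper names (build_vector_index_params, uses_linear_builder, parse_embedding) and the quantized flag are hypothetical, and the sketch assumes that the parameters can be treated as a flat key-value map. Defaults are taken from the Valid value column. Parameters whose valid values are not listed in the table (distance_type, vector_index_type, major_order) are left for you to supply.

```python
import json

# Defaults copied from the "Valid value" column of the table above.
DEFAULTS = {
    "embedding_delimiter": ",",
    "linear_build_threshold": 5000,
    "min_scan_doc_cnt": 20000,
    "enable_recall_report": True,
    "is_embedding_saved": False,
    "ignore_invalid_doc": True,
    "indexer": "aitheta2_indexer",
}


def build_vector_index_params(dimension, quantized=False, **overrides):
    """Hypothetical helper: merge caller overrides into the table defaults.

    `quantized` is an assumed flag that stands for INT8 or FP16 quantization;
    the table does not say how quantization is switched on.
    """
    if isinstance(dimension, bool) or not isinstance(dimension, int) or dimension <= 1:
        raise ValueError("dimension must be an integer greater than 1")
    params = {"dimension": dimension, **DEFAULTS, **overrides}
    # Rule from the is_embedding_saved row: quantization together with
    # real-time indexing requires the original vectors to be saved.
    if quantized and params.get("enable_rt_build") and not params["is_embedding_saved"]:
        raise ValueError(
            "set is_embedding_saved to true when quantization and "
            "real-time indexing are both enabled"
        )
    return params


def uses_linear_builder(doc_count, threshold=DEFAULTS["linear_build_threshold"]):
    """Return True when the document count is below linear_build_threshold."""
    return doc_count < threshold


def parse_embedding(text, dimension, delimiter=DEFAULTS["embedding_delimiter"]):
    """Split a delimited vector string; return None if the data is invalid.

    This mirrors, on the client side, what ignore_invalid_doc describes:
    a vector with the wrong dimension or with no data is discarded.
    """
    try:
        values = [float(part) for part in text.split(delimiter)]
    except ValueError:
        return None
    return values if len(values) == dimension else None


if __name__ == "__main__":
    params = build_vector_index_params(128, enable_rt_build=True)
    print(json.dumps(params, indent=2))
```

A pre-check such as parse_embedding can help you find malformed vectors before you push data, instead of relying on ignore_invalid_doc to drop them silently.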