When you create a table, you can configure advanced settings for vector indexes in the Index Schema step. This topic describes the parameters for these advanced settings.
When you create a table, configure the index schema in the Index Schema step.
The following figure shows the parameters for advanced configurations.
The following table describes the parameters.
Parameter | Valid value | Description |
Vector Dimension | N/A | The number of features or attributes of a vector. The dimension determines how much information and how many features a vector can represent. Set this parameter to the dimension of the vectors that your vector model generates. |
Distance Type | | A smaller SquareEuclidean score indicates higher relevance. A larger InnerProduct score indicates higher relevance. The Cosine score ranges from -1 to 1. A value of -1 indicates that the two vectors point in opposite directions and have the lowest similarity. A value of 1 indicates that the two vectors point in the same direction and have the highest similarity. |
Vector Index Algorithm | | The vector indexing algorithm. For more information, see Introduction to vectors. Note: The DiskANN algorithm is supported only when the data node specification family is SSD. |
Real-time Indexing | | Specifies whether to enable the real-time indexing feature. If you set this parameter to true, the OpenSearch Vector Search Edition instance builds indexes for the real-time data that you push by calling API operations, and you can then query the data in real time. |
Real-time Indexing Parameters | {"proxima.oswg.streamer.segment_size":2048} | The parameters for real-time indexing. We recommend that you use the default value. |
Index Retrieval Parameters | N/A | The parameters for real-time retrieval. You must configure this parameter based on the vector indexing algorithm. For more information, see the following topics: |
Vector Separator | Customizable | The delimiter that is used to separate dimensions during vector retrieval. For example, a comma (,) is used as the delimiter in vector:'1.05066,0.15610,0.156145...'. |
Threshold for Linear Building | Default value: 5000 | The document-count threshold for linear building. With the default value of 5000, the index is built linearly if the number of documents is less than 5,000. |
Ignore Invalid Vector Data | | Specifies whether to ignore invalid vector data. If you set this parameter to true, the system still creates indexes for full or batch incremental data as expected when a vector has an invalid dimension or the vector data is empty. |
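
The following Python sketch illustrates how the Vector Separator, Vector Dimension, and Distance Type parameters relate to each other. It is a minimal example written for this topic and is not the OpenSearch Vector Search Edition implementation: the function names (parse_vector, square_euclidean, inner_product, cosine) are hypothetical, and the sketch assumes that invalid vector data means an empty string or a dimension mismatch.

```python
import math


def parse_vector(text, separator=",", dimension=None):
    """Split a delimited vector string into floats.

    Hypothetical helper: `separator` plays the role of Vector Separator and
    `dimension` plays the role of Vector Dimension.
    """
    if not text.strip():
        raise ValueError("vector data is empty")
    values = [float(part) for part in text.split(separator)]
    if dimension is not None and len(values) != dimension:
        raise ValueError(f"expected {dimension} dimensions, got {len(values)}")
    return values


def square_euclidean(a, b):
    """Squared Euclidean distance: smaller means more relevant."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def inner_product(a, b):
    """Inner product: larger means more relevant."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity in the range -1 to 1."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine is undefined for a zero vector")
    return inner_product(a, b) / (norm_a * norm_b)


# Example vectors (illustrative values only).
query = parse_vector("1.0,0.0", dimension=2)
same_direction = parse_vector("2.0,0.0", dimension=2)
opposite_direction = parse_vector("-1.0,0.0", dimension=2)

print(square_euclidean(query, same_direction))
print(inner_product(query, same_direction))
print(cosine(query, same_direction))
print(cosine(query, opposite_direction))
```

In this sketch, a vector that points in the same direction as the query yields the highest cosine score and a vector that points in the opposite direction yields the lowest, which matches the description of the Cosine distance type in the table.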