Instance-related terms
Term | Description |
number of replicas | The number of redundant copies of the full index data for a single table. Each copy can be used for data retrieval. |
network information | Information about how the instance is accessed over a virtual private cloud (VPC) or the Internet. |
API endpoint | The API endpoint that is required when you use SDKs to perform operations on an instance. |
query test | The feature that is used to retrieve data from tables in an instance. |
change history | The feature that is used to record the history and progress of various O&M operations that you perform. |
metric monitoring | The feature that is used to display metrics related to Query Result Searcher (QRS) workers and Searcher workers. |
alert management | The feature that is used to configure alert metrics, alert rules, and alert contacts for instance-related metrics. |
Table-related terms
Basic information
Term | Description |
data shard | The number of Searcher workers on which index data is stored. When you adjust the number of data shards, make sure that multiple index tables of the same OpenSearch instance have the same number of shards. Alternatively, make sure that at least one index table has one shard and other index tables have the same number of shards. |
number of resources for data updates | The number of resources used for data updates. By default, OpenSearch provides a free quota of two resources for data updates for each data source in an OpenSearch Vector Search Edition instance. Each resource consists of 4 CPU cores and 8 GB of memory. You are charged for resources that exceed the free quota. |
scenario template | OpenSearch Vector Search Edition provides the following templates to help you configure data:
|
data processing | If you select the Vector: Image Search or Vector: Semantic Search for Text template, a data processing method is required. OpenSearch Vector Search Edition provides the following data processing methods:
|
reindexing | The process of building indexes on full data in a MaxCompute or Object Storage Service (OSS) data source. The indexes that are generated during this process are full indexes, and the index versions are full index versions. |
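
The shard-count rule and the free quota for data updates described in the table above can be expressed as two small checks. This is a minimal sketch, not part of any OpenSearch SDK: the function names and the dictionary layout are hypothetical, and the shard check encodes one reading of the rule (tables either all share one shard count, or each table has either one shard or the shared count).

```python
from typing import Dict

# Values taken from the table above.
FREE_UPDATE_RESOURCES = 2      # free quota per data source
CPU_CORES_PER_RESOURCE = 4     # each resource has 4 CPU cores
MEMORY_GB_PER_RESOURCE = 8     # each resource has 8 GB of memory


def shard_counts_valid(shards_by_table: Dict[str, int]) -> bool:
    """Return True if the shard counts follow the rule for one instance.

    Hypothetical helper: ignoring tables with exactly one shard,
    at most one distinct shard count may remain.
    """
    counts = set(shards_by_table.values())
    if any(count < 1 for count in counts):
        return False
    return len(counts - {1}) <= 1


def billable_update_resources(requested: int) -> int:
    """Return how many data-update resources exceed the free quota."""
    if requested < 0:
        raise ValueError("requested must be non-negative")
    return max(0, requested - FREE_UPDATE_RESOURCES)


def update_capacity(requested: int) -> Dict[str, int]:
    """Return the total CPU cores and memory for the requested resources."""
    return {
        "cpu_cores": requested * CPU_CORES_PER_RESOURCE,
        "memory_gb": requested * MEMORY_GB_PER_RESOURCE,
    }
```

For example, `shard_counts_valid({"items": 4, "users": 1})` passes the check, while `{"items": 4, "users": 2}` does not, and requesting five resources leaves three that exceed the free quota.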
Data source
Term | Description |
MaxCompute data source | The data source from which full data is obtained. The raw data is stored in MaxCompute by partition. Incremental data can be pushed by using API operations. |
API data source | The data source from which incremental data is obtained. Data is updated by using API operations. |
OSS data source | The data source from which full data is obtained. The raw data is stored in OSS buckets. Incremental data can be pushed by using API operations. |
Field and index
Term | Description |
field | The component of a document. A field consists of a field name and a field value. |
multi-value field | The field that contains multiple independent values. |
primary key | The field that uniquely identifies a document. |
document | The search unit of structured data. A document can contain one or more fields and must have a primary key field. OpenSearch Vector Search Edition identifies a unique document based on the value of the primary key field. If a new document has the same primary key value as an existing document, the existing document is overwritten by the new document. |
field type | The data type of a field. Examples: INTEGER, FLOAT, and STRING. |
vector field | The field in which vectors are stored. The vector field is a multi-value field of the FLOAT type. |
field that requires word embedding | The field that stores the text or images on which you want to perform word embedding. The field is of the STRING or TEXT type. |
multi-value delimiter | By default, multiple values of a field are separated by HA3 delimiters ( |
fields of a vector index | A vector index consists of the following fields:
|
vector dimension | The length of the generated vector array. |
distance type | The method that is used to measure or calculate the distance between two vectors in a vector space. |
vector index algorithm | The algorithm that is used to search a large set of vectors and retrieve the matching ones. A common method for vector-based queries is to calculate the distance between two vectors and then sort and retrieve vectors based on the distance. |
real-time index | The index that is built based on real-time vector data. |
regular index | The non-vector index. Example: keyword index. |
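
Several terms in the table above (primary key, document, vector dimension, distance type, and distance-based retrieval) can be illustrated together with a small in-memory sketch. Everything here is an assumption made for illustration: the class `TinyVectorTable`, the field names `id` and `vector`, and the two distance functions are hypothetical, and the exhaustive scan stands in for a real vector index algorithm rather than describing the one used by OpenSearch Vector Search Edition.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]


def squared_euclidean(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance; smaller means closer."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def negative_inner_product(a: Vector, b: Vector) -> float:
    """Inner product negated so that smaller still means closer."""
    return -sum(x * y for x, y in zip(a, b))


class TinyVectorTable:
    """Hypothetical in-memory table keyed by the primary key field 'id'."""

    def __init__(self, dimension: int,
                 distance: Callable[[Vector, Vector], float] = squared_euclidean):
        self.dimension = dimension      # vector dimension
        self.distance = distance        # distance type
        self._docs: Dict[str, dict] = {}

    def upsert(self, doc: dict) -> None:
        """Store a document; an existing document with the same primary key is overwritten."""
        if len(doc["vector"]) != self.dimension:
            raise ValueError("vector length must equal the vector dimension")
        self._docs[doc["id"]] = doc

    def search(self, query: Vector, top_k: int = 3) -> List[Tuple[str, float]]:
        """Compute the distance to every stored vector, sort, and return the closest."""
        if len(query) != self.dimension:
            raise ValueError("query length must equal the vector dimension")
        scored = [(pk, self.distance(query, doc["vector"]))
                  for pk, doc in self._docs.items()]
        scored.sort(key=lambda item: item[1])
        return scored[:top_k]
```

A short usage example: after `upsert({"id": "1", "vector": [0.0, 0.0]})`, `upsert({"id": "2", "vector": [3.0, 4.0]})`, and a second `upsert({"id": "1", "vector": [1.0, 1.0]})` on a table with `dimension=2`, only two documents remain, and `search([0.0, 0.0], top_k=2)` returns `[("1", 2.0), ("2", 25.0)]`.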
O&M-related terms
Term | Description |
reindexing | The process in which the indexes on full data are rebuilt without changes to the data source, field configurations, or index schema. |
stop or resume a table | The feature that is used to enable or disable a table in an instance. |
Data changes triggered and implemented by FSM
Change type | Whether to allow recurring events | Description |
ha3_biz_apend | No | This operation can be performed once on each instance. The system automatically triggers this change. The change may continue to run until the index table is added to the instance and indexes are built. |
update_biz_depend_index_fsm | No | This operation can be performed once on each instance. The system automatically triggers this change. The change may continue to run until indexes are built. |
multi_biz_activate | No | Initializes an OpenSearch Vector Search Edition instance. This operation can be performed once on each instance. The change may continue to run until the index table is added to the instance and indexes are built. |
Automatic triggering for full indexing | Yes | The system automatically triggers this change after new data partitions are identified. The latest change and historical changes can run concurrently. |
Manual triggering for full indexing | Yes | The latest change and historical changes can run concurrently. |
Online resources | Yes | For the same zone, all historical changes are terminated before the latest change runs. |
FSM: finite-state machine. An FSM is a mathematical model that represents a finite number of states and the transitions between these states.
Whether to allow recurring events: specifies whether changes of the same change type can recur.
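
The two notes above can be sketched as a table-driven finite-state machine plus a recurrence check. This is an illustrative sketch only: the state names (`pending`, `running`, `done`, `terminated`), the event names, and the classes `Change` and `ChangeLog` are hypothetical and do not describe the internal states of OpenSearch. Only the change type names and their Yes/No values come from the table above.

```python
from typing import Dict, Set, Tuple

# Hypothetical states and events, chosen for illustration only.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("pending", "start"): "running",
    ("pending", "terminate"): "terminated",
    ("running", "finish"): "done",
    ("running", "terminate"): "terminated",
}

# "Whether to allow recurring events" column from the table above.
ALLOWS_RECURRING: Dict[str, bool] = {
    "ha3_biz_apend": False,
    "update_biz_depend_index_fsm": False,
    "multi_biz_activate": False,
    "Automatic triggering for full indexing": True,
    "Manual triggering for full indexing": True,
    "Online resources": True,
}


class Change:
    """A single change whose state moves only along defined transitions."""

    def __init__(self, change_type: str):
        self.change_type = change_type
        self.state = "pending"

    def fire(self, event: str) -> str:
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} is not allowed in state {self.state!r}")
        self.state = TRANSITIONS[key]
        return self.state


class ChangeLog:
    """Rejects a repeated change when its type does not allow recurring events."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()

    def submit(self, change_type: str) -> Change:
        if change_type not in ALLOWS_RECURRING:
            raise KeyError(f"unknown change type: {change_type!r}")
        if not ALLOWS_RECURRING[change_type] and change_type in self._seen:
            raise RuntimeError(f"{change_type!r} can be performed only once")
        self._seen.add(change_type)
        return Change(change_type)
```

In this sketch, submitting `multi_biz_activate` a second time raises an error, while `Manual triggering for full indexing` can be submitted repeatedly, and each returned `Change` moves from `pending` to `running` to `done` or `terminated`.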