
OpenSearch: Terms

Last Updated: Oct 17, 2024

Instance-related terms

Term

Description

number of replicas

The number of redundant copies of the full index data that are kept for a single table. Each copy can be used for data retrieval.

network information

The information about how the instance is accessed, either over a virtual private cloud (VPC) or over the Internet.

API endpoint

The API endpoint that is required when you use SDKs to perform operations on an instance.

query test

The feature that is used to retrieve data from tables in an instance.

change history

The feature that is used to record the history and progress of various O&M operations that you perform.

metric monitoring

The feature that is used to display metrics related to Query Result Searcher (QRS) workers and Searcher workers.

alert management

The feature that is used to configure alert metrics, alert rules, and alert contacts for instance-related metrics.

Table-related terms

Basic information

Term

Description

data shard

The number of Searcher workers on which index data is stored.

When you adjust the number of data shards, make sure that multiple index tables of the same OpenSearch instance have the same number of shards. Alternatively, make sure that at least one index table has one shard and other index tables have the same number of shards.
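The shard rule above can be checked mechanically before you adjust a table. The following is a minimal sketch, assuming the shard counts are available as a plain mapping from table name to shard count. The function name `shard_counts_are_compatible` is hypothetical and is not part of any OpenSearch SDK. It reads the rule as: tables with exactly one shard are always acceptable, and all other tables must share one shard count.

```python
def shard_counts_are_compatible(shards_by_table: dict[str, int]) -> bool:
    """Return True if the shard counts satisfy the rule described above.

    Rule (as interpreted here): either every index table has the same
    shard count, or some tables have exactly one shard and all remaining
    tables share a single shard count.
    """
    counts = list(shards_by_table.values())
    if any(c < 1 for c in counts):
        raise ValueError("shard counts must be positive integers")
    # Single-shard tables are exempt; the remaining counts must all agree.
    multi_shard_counts = {c for c in counts if c != 1}
    return len(multi_shard_counts) <= 1
```

For example, `{"a": 1, "b": 4, "c": 4}` passes the check, whereas `{"a": 2, "b": 4}` does not.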

number of resources for data updates

The number of resources used for data updates. By default, OpenSearch provides a free quota of two resources for data updates for each data source in an OpenSearch Vector Search Edition instance. Each resource consists of 4 CPU cores and 8 GB of memory. You are charged for resources that exceed the free quota.
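The quota arithmetic above can be illustrated with a short calculation. This is only a sketch based on the figures stated in the description (a free quota of two resources per data source, each with 4 CPU cores and 8 GB of memory). The function name `update_resource_summary` is hypothetical, and no prices are modeled because the description does not state any.

```python
FREE_UPDATE_RESOURCES = 2    # free quota per data source, as stated above
CORES_PER_RESOURCE = 4       # CPU cores in one resource
MEMORY_GB_PER_RESOURCE = 8   # GB of memory in one resource


def update_resource_summary(requested: int) -> dict[str, int]:
    """Summarize capacity and billable count for one data source."""
    if requested < 0:
        raise ValueError("requested must be non-negative")
    return {
        "total_cpu_cores": requested * CORES_PER_RESOURCE,
        "total_memory_gb": requested * MEMORY_GB_PER_RESOURCE,
        # Only resources beyond the free quota are charged.
        "billable_resources": max(0, requested - FREE_UPDATE_RESOURCES),
    }
```

Under these assumptions, requesting five resources for one data source gives 20 CPU cores and 40 GB of memory, of which three resources are billable.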

scenario template

OpenSearch Vector Search Edition provides the following templates to help you configure data:

  • Common Template: a template that contains no preset fields or indexes. You can use this template to create text indexes and vector indexes.

  • Vector: Image Search: You can use this template to search for other images based on text or images. Relevant fields and indexes are preset in the template to simplify the configuration steps.

  • Vector: Semantic Search for Text: You can use this template in scenarios such as word embedding, semantic analysis and understanding, and conversational searches. Relevant fields and indexes are preset in the template to simplify the configuration steps.

data processing

If you select the Vector: Image Search or Vector: Semantic Search for Text template, a data processing method is required. OpenSearch Vector Search Edition provides the following data processing methods:

  • Has Vector Data: You have a vector model to generate vectors in advance. You can directly use OpenSearch Vector Search Edition instances to perform vector-based queries.

  • Convert Raw Data to Vector Data: You do not have vector models. Before you perform vector-based queries, you need to use the engine of OpenSearch to generate vectors from text or images.

reindexing

The process of building indexes on full data in a MaxCompute or Object Storage Service (OSS) data source. The indexes that are generated during this process are full indexes, and the index versions are full index versions.

Data source

Term

Description

MaxCompute data source

The data source from which full data is obtained. The raw data is stored in MaxCompute by partition. Incremental data can be pushed by using API operations.

API data source

The data source from which incremental data is obtained. Data is updated by using API operations.

OSS data source

The data source from which full data is obtained. The raw data is stored in OSS buckets. Incremental data can be pushed by using API operations.

Field and index

Term

Description

field

The component of a document. A field consists of a field name and a field value.

multi-value field

The field that contains multiple independent values.

primary key

The field that uniquely identifies a document.

document

The search unit of structured data. A document can contain one or more fields and must have a primary key field. OpenSearch Vector Search Edition identifies a unique document based on the value of the primary key field. If a new document has the same primary key value as an existing document, the existing document is overwritten by the new document.
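The overwrite behavior described above is the same as an upsert keyed on the primary key. The sketch below models it with an in-memory dictionary. It does not call the OpenSearch API, and the names `upsert`, `store`, and the `id` field are illustrative assumptions.

```python
def upsert(store: dict, doc: dict, pk_field: str = "id") -> None:
    """Insert doc, replacing any existing document with the same primary key."""
    if pk_field not in doc:
        raise KeyError(f"document is missing primary key field {pk_field!r}")
    # The whole document is replaced, not merged field by field.
    store[doc[pk_field]] = doc


store: dict = {}
upsert(store, {"id": "1", "title": "old title"})
upsert(store, {"id": "1", "title": "new title"})  # same key: overwrites
upsert(store, {"id": "2", "title": "another document"})
```

After these three calls, `store` holds two documents, and the document with primary key `"1"` carries the title from the second call.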

field type

The data type of a field. Examples: INTEGER, FLOAT, and STRING.

vector field

The field in which vectors are stored. The vector field is a multi-value field of the FLOAT type.

field that requires word embedding

The field that stores the text or images on which you want to perform word embedding. The field is of the STRING or TEXT type.

multi-value delimiter

By default, multiple values of a field are separated by the HA3 delimiter (^]), which is the control character U+001D and is encoded as the single byte \x1D in UTF-8. You can also use a custom delimiter to separate values.
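When you prepare data yourself, the delimiter has to be written as the actual control character, not as the two visible characters "^" and "]". The sketch below shows how a multi-value field, such as a vector stored as a multi-value FLOAT field, might be joined and split. The helper names are hypothetical and are not part of any SDK.

```python
HA3_MULTI_VALUE_DELIMITER = "\x1d"  # default delimiter, shown as ^] in the text


def join_multi_value(values, delimiter: str = HA3_MULTI_VALUE_DELIMITER) -> str:
    """Serialize multiple values into one delimited string."""
    return delimiter.join(str(v) for v in values)


def split_multi_value(raw: str, delimiter: str = HA3_MULTI_VALUE_DELIMITER) -> list[str]:
    """Split a delimited string into its values; empty input yields no values."""
    return raw.split(delimiter) if raw else []


def parse_vector(raw: str, delimiter: str = HA3_MULTI_VALUE_DELIMITER) -> list[float]:
    """Parse a delimited multi-value FLOAT field into a list of floats."""
    return [float(v) for v in split_multi_value(raw, delimiter)]
```

Passing a different `delimiter` argument, for example `","`, corresponds to the custom-delimiter case mentioned above.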

fields of a vector index

A vector index consists of the following fields:

  • Primary key field: the unique primary key field in the field configurations.

  • Namespace field: optional. This field is used to classify or filter vectors during vector-based queries.

  • Vector field: the unique vector field in the field configurations.

vector dimension

The length of the generated vector array.

distance type

The method that is used to measure or calculate the distance between two vectors in a vector space.
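As an illustration, the sketch below implements two widely used distance measures: squared Euclidean distance and inner product. The description above does not list which distance types the product supports, so treat these two as assumed examples, not as the product's definitive option set.

```python
from typing import Sequence


def _check_same_length(a: Sequence[float], b: Sequence[float]) -> None:
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")


def squared_euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of squared component differences; smaller means more similar."""
    _check_same_length(a, b)
    return sum((x - y) ** 2 for x, y in zip(a, b))


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of component products; larger means more similar."""
    _check_same_length(a, b)
    return sum(x * y for x, y in zip(a, b))
```

Note that the two measures rank in opposite directions, so the sort order used during retrieval depends on the distance type chosen.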

vector index algorithm

The algorithm that is used to search for and retrieve a large number of vectors. A common method for vector-based queries is to calculate the distance between two vectors and then sort and retrieve vectors based on the distance.
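The "calculate the distance, then sort and retrieve" approach can be sketched as an exhaustive scan. This is only an illustration of the idea. Engines that handle large vector sets typically rely on approximate index structures instead, and the function name `brute_force_top_k` and the use of squared Euclidean distance are assumptions made for this example.

```python
import heapq
from typing import Sequence


def brute_force_top_k(
    query: Sequence[float],
    vectors: dict[str, Sequence[float]],
    k: int,
) -> list[tuple[str, float]]:
    """Return the k (primary_key, distance) pairs closest to query."""

    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Score every stored vector, then keep the k smallest distances.
    scored = ((pk, sq_dist(query, vec)) for pk, vec in vectors.items())
    return heapq.nsmallest(k, scored, key=lambda item: item[1])


vectors = {"a": [0.0, 0.0], "b": [1.0, 1.0], "c": [5.0, 5.0]}
nearest = brute_force_top_k([0.9, 0.9], vectors, k=2)
```

In this example, `nearest` lists document `"b"` first and document `"a"` second, because `"b"` is closest to the query vector.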

real-time index

The index that is built based on real-time vector data.

regular index

The non-vector index. Example: keyword index.

O&M-related terms

Term

Description

reindexing

The process in which the indexes on full data are rebuilt without changes in the data source, field configurations, or index schema.

stop or resume a table

The feature that is used to enable or disable a table in an instance.

Data changes triggered and implemented by FSM

Change type

Whether to allow recurring events

Description

ha3_biz_apend

No

This operation can be performed once on each instance. The system automatically triggers this change. The change may continue to run until the index table is added to the instance and indexes are built.

update_biz_depend_index_fsm

No

This operation can be performed once on each instance. The system automatically triggers this change. The change may continue to run until indexes are built.

multi_biz_activate

No

Initializes an OpenSearch Vector Search Edition instance.

This operation can be performed once on each instance. The change may continue to run until the index table is added to the instance and indexes are built.

Automatic triggering for full indexing

Yes

The system automatically triggers this change after new data partitions are identified. The latest change and historical changes can run concurrently.

Manual triggering for full indexing

Yes

The latest change and historical changes can run concurrently.

Online resources

Yes

For the same zone, all historical changes are terminated before the latest change runs.

  • FSM: finite-state machine. An FSM is a mathematical model that represents a finite number of states and the transitions between these states.

  • Whether to allow recurring events: specifies whether to allow recurring events of the same change type.
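To make the FSM idea concrete, the sketch below models a change as a small state machine. The state names (`pending`, `running`, `finished`, `terminated`) and the event names are hypothetical and chosen only for illustration. They are not the states that OpenSearch uses internally.

```python
# Allowed transitions: (current state, event) -> next state.
# All state and event names here are illustrative assumptions.
TRANSITIONS = {
    ("pending", "start"): "running",
    ("pending", "terminate"): "terminated",
    ("running", "finish"): "finished",
    ("running", "terminate"): "terminated",
}


class ChangeFsm:
    """A minimal finite-state machine for the lifecycle of one change."""

    def __init__(self, state: str = "pending") -> None:
        self.state = state

    def fire(self, event: str) -> str:
        """Apply an event, or raise if it is not allowed in the current state."""
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} is not allowed in state {self.state!r}")
        self.state = TRANSITIONS[key]
        return self.state
```

Because every legal move is listed in one table, an invalid sequence such as restarting a finished change is rejected instead of silently accepted.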