OpenSearch: Terms

Last Updated: Feb 27, 2024

Data-related terms

Term	Description
MaxCompute data source	The data source from which full data is obtained. The raw data is stored in MaxCompute by partition.
API data source	The data source from which incremental data is obtained. Data is updated by calling API operations.
document	The search unit of structured data. A document can contain one or more fields and must have a primary key field. Retrieval Engine Edition identifies a unique document based on the value of the primary key field. If a new document has the same primary key value as an existing document, the existing document is overwritten by the new document.
field	A component of a document. A field consists of a field name and a field value.
multi-value field	A field that contains multiple independent values.
primary key	The field that uniquely identifies a document.
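
The overwrite rule for documents that share a primary key value can be illustrated with a minimal sketch. The `DocumentStore` class and the field names `id`, `title`, and `tags` are hypothetical and exist only for this illustration; they are not part of the OpenSearch API. A Python list stands in for a multi-value field.

```python
from typing import Dict, List, Optional, Union

PrimaryKey = Union[str, int]
FieldValue = Union[str, int, float, List[str]]  # a list models a multi-value field
Document = Dict[str, FieldValue]


class DocumentStore:
    """Toy in-memory store keyed by the primary key field (hypothetical)."""

    def __init__(self, primary_key: str) -> None:
        self.primary_key = primary_key
        self._docs: Dict[PrimaryKey, Document] = {}

    def add(self, doc: Document) -> None:
        # Every document must carry the primary key field.
        if self.primary_key not in doc:
            raise ValueError(f"missing primary key field {self.primary_key!r}")
        # A new document with an existing primary key value replaces the old one.
        self._docs[doc[self.primary_key]] = dict(doc)

    def get(self, key: PrimaryKey) -> Optional[Document]:
        return self._docs.get(key)

    def __len__(self) -> int:
        return len(self._docs)


store = DocumentStore(primary_key="id")
store.add({"id": 1, "title": "first version", "tags": ["a", "b"]})
store.add({"id": 1, "title": "second version", "tags": ["c"]})  # overwrites id 1
```

After both calls the store holds a single document for `id` 1, and its `title` is the one from the second call.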

Retrieval Engine Edition

Term	Description
Query Result Searcher (QRS) worker	A role used in online search. QRS workers parse query requests and merge the results returned by Searcher workers.
Searcher worker	A role used in online search. Searcher workers load index data and provide search services.
cluster	A search service that consists of a set of QRS workers and Searcher workers.
Processor	A role used in offline indexing for parsing users' raw data.
Builder	A role used in offline indexing for building indexes from raw data.
Merger	A role used in offline indexing for merging and sorting indexes.
full indexing	The process of building indexes from the full data in a MaxCompute data source. The indexes generated by this process are full indexes, and their versions are full index versions.
incremental indexing	The process in which, as data is updated in real time, offline indexing generates indexes and applies them to online clusters.
real-time indexing	The process by which data pushed through API operations takes effect in real time. Real-time indexes are generated in the memory of Searcher workers.
inverted index	An inverted index is a linked list that maps terms to their locations in a set of documents. Inverted indexes are used in query clauses to make queries efficient. Example: term1->doc1,doc2,doc3; term2->doc1,doc2.
forward index	A forward index is a linked list that maps documents to fields. Forward indexes are used in FILTER clauses. Forward indexes are less efficient than inverted indexes. Example: doc1->id,type,create_time…
summary index	A summary index collects and stores the information that you want the system to display in summaries of search results. You can query information that is contained in a search result summary by specifying the primary key or document ID. Retrieval Engine Edition displays the search results by page.
tokenization	The process of splitting the sentences in documents into terms. If the data type of a field is TEXT, the system tokenizes the sentences in that field into meaningful terms. For example, in a TEXT field, "浙江大学" (Zhejiang University) is tokenized into two terms: "浙江" (Zhejiang) and "大学" (University).
term	A term is a token or a set of tokens after tokenization.
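
The relationship between inverted and forward indexes can be sketched as follows. This is an illustration only, using plain Python dictionaries and lists; it does not reflect how Retrieval Engine Edition stores indexes internally. The sample documents, the `type` field, and the function names are assumptions made for the example. The inverted index answers the query term, and the forward index is then consulted to filter the matching documents.

```python
from collections import defaultdict
from typing import Dict, List, Optional

# Pre-tokenized sample documents; IDs, field names, and values are made up.
docs = {
    1: {"type": "news", "terms": ["term1", "term2"]},
    2: {"type": "blog", "terms": ["term1", "term2"]},
    3: {"type": "news", "terms": ["term1"]},
}


def build_inverted_index(docs: dict) -> Dict[str, List[int]]:
    """Map each term to the sorted list of document IDs that contain it."""
    inverted = defaultdict(list)
    for doc_id in sorted(docs):
        # dict.fromkeys de-duplicates repeated terms while keeping their order.
        for term in dict.fromkeys(docs[doc_id]["terms"]):
            inverted[term].append(doc_id)
    return dict(inverted)


def build_forward_index(docs: dict, fields: List[str]) -> Dict[int, dict]:
    """Map each document ID to the values of the selected fields."""
    return {doc_id: {f: doc[f] for f in fields} for doc_id, doc in docs.items()}


def search(inverted: Dict[str, List[int]], forward: Dict[int, dict],
           term: str, type_filter: Optional[str] = None) -> List[int]:
    """Look up the term in the inverted index, then filter via the forward index."""
    candidates = inverted.get(term, [])
    if type_filter is None:
        return list(candidates)
    return [d for d in candidates if forward[d]["type"] == type_filter]


inverted = build_inverted_index(docs)
forward = build_forward_index(docs, fields=["type"])
news_hits = search(inverted, forward, "term1", type_filter="news")
```

Here `inverted` reproduces the shape of the example in the table (term1 maps to documents 1, 2, and 3; term2 maps to documents 1 and 2), and the filter step reads one forward-index entry per candidate document, which is why filtering costs more than the term lookup.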

Data changes triggered and implemented by FSM

Change type	Whether to allow recurring events	Description
Service discovery	Yes	Maps the domain name to the IP address of a Retrieval Engine Edition instance so that you can call the service. For the same cluster, all historical changes are terminated before the latest change runs.
ha3_biz_apend	No	Adds biz. This operation can be performed once on each instance. The system automatically triggers this change. The change may continue to run until the index table is added to the instance and the index is built.
update_biz_depend_index_fsm	No	Updates the index on which biz depends. This operation can be performed once on each instance. The system automatically triggers this change. The change may continue to run until the index table is added to the instance and the index is built.
Online deployment	Yes	For the same cluster, all historical changes are terminated before the latest change runs.
multi_biz_activate	No	Initializes a Retrieval Engine Edition instance. This operation can be performed once on each instance. The change may continue to run until the index table is added to the instance and the index is built.
Index creation	Yes	For the same index, all historical changes are terminated before the latest change runs.
Automatically triggered full indexing	Yes	The system automatically triggers this change after new data partitions are identified. The latest change and historical changes can run concurrently.
Manually triggered full indexing	Yes	The latest change and historical changes can run concurrently.
Configuration push	Yes	All historical changes are terminated before the latest change runs.
Online resources	Yes	For the same zone, all historical changes are terminated before the latest change runs.
Index rollback	Yes	The latest change and historical changes can run concurrently.

Note

FSM: finite-state machine, a mathematical model that represents a finite number of states and the transitions between these states.
Whether to allow recurring events: specifies whether a change of this type can be triggered more than once.
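
The three behaviors described in the table (run once, terminate historical changes, run concurrently) can be sketched as a small state model. This is a hedged illustration, not the actual FSM implementation: the `Policy` labels, the two-state `State` enum, the `ChangeScheduler` class, and the `scope` argument (a cluster, index, zone, or instance name) are all assumptions introduced for the example, and only a subset of the change types is listed.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class State(Enum):
    RUNNING = "running"
    TERMINATED = "terminated"


class Policy(Enum):
    ONCE = "once"                     # recurring events not allowed
    TERMINATE_PREVIOUS = "terminate"  # latest change terminates historical ones
    CONCURRENT = "concurrent"         # latest and historical changes run together


# Subset of the table; the policy labels are this sketch's own names.
POLICIES = {
    "multi_biz_activate": Policy.ONCE,
    "Index creation": Policy.TERMINATE_PREVIOUS,
    "Manually triggered full indexing": Policy.CONCURRENT,
}


@dataclass
class Change:
    change_type: str
    scope: str
    state: State = State.RUNNING


@dataclass
class ChangeScheduler:
    history: List[Change] = field(default_factory=list)

    def submit(self, change_type: str, scope: str) -> Change:
        policy = POLICIES[change_type]
        same = [c for c in self.history
                if c.change_type == change_type and c.scope == scope]
        if policy is Policy.ONCE and same:
            raise RuntimeError(f"{change_type} already ran for {scope}")
        if policy is Policy.TERMINATE_PREVIOUS:
            # Historical changes in the same scope move RUNNING -> TERMINATED.
            for c in same:
                if c.state is State.RUNNING:
                    c.state = State.TERMINATED
        change = Change(change_type, scope)
        self.history.append(change)
        return change


scheduler = ChangeScheduler()
first = scheduler.submit("Index creation", "index_a")
second = scheduler.submit("Index creation", "index_a")  # terminates `first`
```

In this sketch, submitting a second "Index creation" change for the same index moves the first one to `TERMINATED`, while two "Manually triggered full indexing" changes would both stay `RUNNING`.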