An application is a set of data configurations, such as data source schema, index schema, and data attributes. An application serves as a search service.
A document is a search unit of structured data. A document can contain one or more fields and must have a primary key field. OpenSearch identifies a unique document based on the value of the primary key field. If a new document has the same primary key value as an existing document, the existing document is overwritten by the new one.
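The overwrite rule for primary keys can be illustrated with a minimal sketch. The `push` helper, the in-memory `store`, and the `id` field name are hypothetical and are not part of the OpenSearch API; they only model the behavior described above.

```python
# Hypothetical in-memory model of a document store keyed by primary key.
# This is an illustration only, not the OpenSearch push API.
def push(store, doc, pk_field="id"):
    """Insert doc, overwriting any existing document with the same primary key."""
    if pk_field not in doc:
        raise ValueError(f"document is missing primary key field {pk_field!r}")
    store[doc[pk_field]] = doc
    return store


store = {}
push(store, {"id": 1, "title": "first version"})
push(store, {"id": 1, "title": "second version"})  # same key: overwrites
push(store, {"id": 2, "title": "another document"})
```

After these three pushes, the store holds two documents, and the document with primary key 1 carries the second title.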
A field is a component of a document. A field consists of a field name and a field value.
To help you process data during data import, OpenSearch provides various built-in data processing plug-ins. You can apply these plug-ins through the content conversion feature when you define the schema or configure a data source for an application.
Source data is the original data to be pushed to OpenSearch. It contains one or more source fields.
A source field is the smallest unit of the source data. A source field consists of a field name and a field value. For more information about supported data types, see Application schema and index schema.
An index is a data structure that is used to accelerate document retrieval. You can create multiple indexes.
You can create a composite index on multiple source fields of the text types such as TEXT and SHORT_TEXT. For example, if you need to create a forum search service that supports both title-based searches and comprehensive searches based on titles and bodies, you can create the title_search index on titles and the default index on both titles and bodies. This way, title-based searches are implemented based on the title_search index. Comprehensive searches based on titles and bodies are implemented based on the default index.
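The forum example can be sketched as follows. The `INDEXES` mapping, the helper names, and the whitespace-based term splitting are assumptions made for illustration; a real application defines its indexes in the application schema and uses a configured analyzer.

```python
from collections import defaultdict

# Hypothetical index definitions: index name -> source fields it covers.
INDEXES = {
    "title_search": ["title"],
    "default": ["title", "body"],
}


def build_indexes(docs):
    """Build one term -> set(document ids) mapping per index."""
    built = {name: defaultdict(set) for name in INDEXES}
    for doc in docs:
        for name, fields in INDEXES.items():
            for field in fields:
                # Naive whitespace split stands in for a real analyzer.
                for term in doc[field].lower().split():
                    built[name][term].add(doc["id"])
    return built


def search(built, index_name, term):
    """Return the sorted ids of documents that contain term in the given index."""
    return sorted(built[index_name].get(term.lower(), set()))


docs = [
    {"id": 1, "title": "Install guide", "body": "Steps to configure the service"},
    {"id": 2, "title": "Configure alerts", "body": "Install the agent first"},
]
built = build_indexes(docs)
```

Here `search(built, "title_search", "install")` matches only document 1, whereas `search(built, "default", "install")` matches both documents because the default index also covers the body.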
Index fields can be used in query clauses. To implement high-performance data retrieval, you must define index fields.
Attribute fields can be used in the FILTER, SORT, AGGREGATE, and DISTINCT clauses of queries to implement features such as filtering and statistics.
default display field
Default display fields are displayed in search results. You can use fetch_fields, which is an API parameter, to specify the fields to return for each search request. Note that if you set the fetch_fields parameter in your program, the configurations of the default display fields in your application are ignored and the fields that are specified by the fetch_fields parameter are displayed in the search results. If you do not set the fetch_fields parameter in your program, the default display fields in your application are displayed in the search results.
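The precedence between fetch_fields and the default display fields can be sketched as below. The function names are hypothetical, and treating `None` as "parameter not set" is an assumption of this sketch, not documented API behavior.

```python
def fields_to_return(default_display_fields, fetch_fields=None):
    """Pick the fields to show for one search request."""
    # Assumption: None means the request did not set fetch_fields.
    if fetch_fields is not None:
        # fetch_fields is set, so the default display fields are ignored.
        return list(fetch_fields)
    return list(default_display_fields)


def render(doc, default_display_fields, fetch_fields=None):
    """Project a document onto the selected fields."""
    names = fields_to_return(default_display_fields, fetch_fields)
    return {name: doc[name] for name in names if name in doc}


doc = {"id": 7, "title": "Release notes", "body": "Long text", "author": "ops"}
defaults = ["id", "title"]
```

With these values, `render(doc, defaults)` returns the `id` and `title` fields, and `render(doc, defaults, fetch_fields=["author"])` returns only `author`.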
Analysis is used to segment the field values in documents that are pushed to OpenSearch. For example, the English - Analysis with Word Root analyzer can segment the "english analyzer" query string into "english" and "analyzer".
A term is a text element generated after analysis.
After analysis, indexes are built based on terms. This allows OpenSearch to quickly locate specific documents based on search requests. Search engines can build two types of linked lists: inverted indexes and forward indexes.
An inverted index is a linked list that maps terms to their locations in a set of documents. Inverted indexes are used in query clauses to improve query efficiency. Example: term1->doc1,doc2,doc3 and term2->doc1,doc2.
A forward index is a linked list that maps documents to fields. Forward indexes are used in FILTER clauses. The efficiency of forward indexes is lower than that of inverted indexes. Example: doc1->id,type,create_time…
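The two mappings can be sketched with the example shapes given above (term1->doc1,doc2,doc3 and doc1->id,type,...). This sketch uses plain Python dictionaries and lists rather than linked lists, and the sample documents and the `query` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample documents; "text" is the analyzed field.
docs = {
    "doc1": {"id": 1, "type": "news", "text": "term1 term2"},
    "doc2": {"id": 2, "type": "blog", "text": "term1 term2"},
    "doc3": {"id": 3, "type": "news", "text": "term1"},
}

# Inverted index: term -> documents that contain the term.
inverted = defaultdict(list)
for doc_id, doc in docs.items():
    for term in doc["text"].split():
        inverted[term].append(doc_id)

# Forward index: document -> field values used for filtering.
forward = {doc_id: {"id": d["id"], "type": d["type"]} for doc_id, d in docs.items()}


def query(term, type_filter=None):
    """Look up a term in the inverted index, then filter via the forward index."""
    candidates = inverted.get(term, [])
    if type_filter is None:
        return list(candidates)
    return [d for d in candidates if forward[d]["type"] == type_filter]
```

The query clause narrows the candidates through the inverted index first, so the per-document forward lookup in the filter step runs only on documents that already matched.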
After documents are pushed to OpenSearch, the field values in the documents are segmented into individual terms based on query keywords. OpenSearch looks up inverted indexes that are built based on the terms to find matched documents. This process is referred to as retrieval.
The number of documents that are retrieved.
The source of data to be pushed. OpenSearch supports automatic integration with major Alibaba Cloud storage services.
This feature rebuilds indexes on data. Generally, you need to manually rebuild indexes after you configure a data source for an application for the first time, modify the data source, or modify the application schema. Scheduled reindexing is used to re-import full data to an application. You must enable automatic data synchronization to use this feature.
The cumulative size of all documents in the tables of an application. The cumulative size is calculated based on the field values. Each field value is converted to a string to calculate the cumulative size.
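A rough sketch of this calculation is shown below. The text does not state how the length of each string is measured; counting UTF-8 bytes is an assumption of this sketch, and the table and field names are hypothetical.

```python
def document_size(doc):
    """Sum the sizes of all field values after converting each to a string."""
    # Assumption: size is counted in UTF-8 bytes of the string form.
    return sum(len(str(value).encode("utf-8")) for value in doc.values())


def total_size(tables):
    """Cumulative size of all documents across all tables of an application."""
    return sum(document_size(doc) for docs in tables.values() for doc in docs)


tables = {
    "main": [{"id": 1, "title": "hello"}, {"id": 22, "title": "hi"}],
    "extra": [{"id": 1, "score": 3.5}],
}
```

For the sample data, the three documents contribute 6, 4, and 4 bytes, so `total_size(tables)` is 14.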
The number of queries per second.
A logical computing unit (LCU) is the unit that is used to measure the computing power of a search service. An LCU indicates the computing power of 10 millicores in a search cluster. Millicore is the unit of CPU resources. Each millicore is one thousandth of one core.
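The conversion follows directly from the two ratios in the definition: 10 millicores per LCU and 1,000 millicores per core, so one core corresponds to 100 LCUs. The helper names below are hypothetical.

```python
MILLICORES_PER_LCU = 10      # from the definition of an LCU
MILLICORES_PER_CORE = 1000   # one millicore is one thousandth of a core


def lcus_to_cores(lcus):
    """Convert a number of LCUs to CPU cores."""
    return lcus * MILLICORES_PER_LCU / MILLICORES_PER_CORE


def cores_to_lcus(cores):
    """Convert a number of CPU cores to LCUs."""
    return cores * MILLICORES_PER_CORE / MILLICORES_PER_LCU
```

For example, `lcus_to_cores(100)` gives 1.0 and `cores_to_lcus(2)` gives 200.0.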
A sort expression is a mathematical expression that you can write to control the sorting of search results. You can use basic mathematical operations, mathematical functions and built-in functions to write a sort expression.
rough sort expression
A rough sort expression is used to sort the search results for the first round. In this round, all documents that meet the specified conditions are traversed. Up to 1 million documents can be traversed. Because so many documents may be scored, write a simple rough sort expression that sorts the search results by key factors. For example, you can sort the search results for news by text relevance and timeliness. The system calculates the matching scores of the documents based on the rough sort expression and sorts the documents based on the calculated scores.
fine sort expression
The system selects top N results that are sorted based on a rough sort and calculates the matching scores of the results in a more precise manner by using a fine sort expression. Then, the system sorts the results based on the calculated scores and returns the results to users.
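The two rounds can be sketched as follows. The scoring functions, the field names, and the sample data are hypothetical; real sort expressions are written in the OpenSearch expression syntax, not in Python.

```python
def two_phase_sort(docs, rough_score, fine_score, top_n, max_traversed=1_000_000):
    """Rank all matches with a cheap score, then re-rank only the top N precisely."""
    traversed = docs[:max_traversed]  # cap on documents scored in the rough round
    rough_ranked = sorted(traversed, key=rough_score, reverse=True)
    candidates = rough_ranked[:top_n]
    return sorted(candidates, key=fine_score, reverse=True)


# Hypothetical matched documents with precomputed features.
docs = [
    {"id": 1, "text_score": 0.9, "age_days": 30, "clicks": 5},
    {"id": 2, "text_score": 0.8, "age_days": 1, "clicks": 50},
    {"id": 3, "text_score": 0.2, "age_days": 2, "clicks": 500},
    {"id": 4, "text_score": 0.7, "age_days": 3, "clicks": 20},
]


def rough(doc):
    # Cheap score: text match plus a recency bonus.
    return doc["text_score"] + 1.0 / (1 + doc["age_days"])


def fine(doc):
    # More detailed score: the rough score plus a click-based signal.
    return rough(doc) + 0.01 * doc["clicks"]


ranked = two_phase_sort(docs, rough, fine, top_n=3)
```

Document 3 has the highest fine score of all four documents, but it never reaches the fine round because its rough score places it outside the top 3. This is the trade-off of the design: the fine sort expression can be more expensive because it runs on only N documents, so the rough sort expression needs to capture the key factors.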
search result summary
Text content is generally long. To help users understand the main content of a document, only part of the content of a document is displayed in the search results.
OpenSearch allows you to create various query analysis rules, such as spelling check, stop word filtering, and term weight analysis. These rules let you intervene in searches and provide a better search experience for users.
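One simple way to produce such a summary is to show a fixed-width window of text around the first keyword match. The sketch below illustrates that idea only; the `summarize` function, its parameters, and the windowing strategy are assumptions and do not describe how OpenSearch builds summaries.

```python
def summarize(text, keyword, width=40):
    """Return a window of about `width` characters around the first keyword match."""
    pos = text.lower().find(keyword.lower())
    if pos < 0:
        # No match: fall back to the beginning of the text.
        return text[:width] + ("..." if len(text) > width else "")
    start = max(0, pos - width // 2)
    end = min(len(text), start + width)
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + text[start:end] + suffix


text = "OpenSearch is a distributed search service. " * 5
snippet = summarize(text, "service", width=20)
```

The ellipses mark where content was cut, so users can tell that the snippet is only a part of the document.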