OpenSearch achieves high search performance by dividing the sort process into two phases: rough sort and fine sort. Rough sort selects the top N high-quality documents from all retrieved documents. Fine sort then scores and sorts those top N documents so that users obtain the documents that best match their requirements. Rough sort affects search performance, whereas fine sort determines the final order of the results. For this reason, keep rough sort simple and efficient: base it only on the key factors that are used for fine sort. Both rough sort and fine sort are configured by using sort expressions.
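The two-phase idea can be illustrated with a minimal, self-contained sketch. It is not OpenSearch code: the class name TwoPhaseSortSketch, the method sort, and the two scoring functions are hypothetical, and the sketch assumes that only the top N documents are returned.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Hypothetical illustration of rough sort followed by fine sort.
final class TwoPhaseSortSketch {

    static <T> List<T> sort(List<T> retrieved, int topN,
                            ToDoubleFunction<T> roughScore,
                            ToDoubleFunction<T> fineScore) {
        // Phase 1 (rough sort): order all retrieved documents by a cheap score,
        // highest first, and keep only the top N.
        List<T> candidates = new ArrayList<>(retrieved);
        candidates.sort(Comparator.comparingDouble(roughScore).reversed());
        int keep = Math.min(Math.max(topN, 0), candidates.size());
        List<T> survivors = new ArrayList<>(candidates.subList(0, keep));

        // Phase 2 (fine sort): apply the more expensive score to the survivors only.
        survivors.sort(Comparator.comparingDouble(fineScore).reversed());
        return survivors;
    }

    public static void main(String[] args) {
        // Each made-up document is {roughScore, fineScore}.
        List<double[]> docs = new ArrayList<>();
        docs.add(new double[] {0.9, 0.1});
        docs.add(new double[] {0.8, 0.7});
        docs.add(new double[] {0.1, 0.99});

        for (double[] d : sort(docs, 2, d -> d[0], d -> d[1])) {
            System.out.println(d[0] + " " + d[1]);
        }
    }
}
```

In this sketch the third document has the best fine score but never reaches fine sort, because its rough score is too low. This is why the rough sort expression should reflect the key factors of the fine sort expression.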
You can customize sort expressions for applications and specify sort expressions in search queries to sort results. Sort expressions support basic operations, mathematical functions, and feature functions. The basic operations include arithmetic operations, relational operations, logical operations, bitwise operations, and conditional operations. OpenSearch provides expression templates for you to perform searches in typical applications, such as forum and news applications. You can select an appropriate expression template based on your data features and modify the selected template to generate a custom expression.
Before you perform fine sort by relevance, make sure that you understand how a sort policy works: After the documents that meet your requirements are found based on your queries, the documents are sorted. For more information, see Sort clause. If you do not specify a sort clause or have specified a rank function in a sort clause, scores are calculated by relevance.
Rough and fine sort expressions can be designed based on your actual search needs. For more information about how to design and arrange sort factors in typical scenarios, see Sort results by relevance.
Note: To perform basic operations such as arithmetic, relational, logical, and conditional operations, you must use numbers or field values of the NUMERIC type in sort expressions. Most function-based operations cannot be performed on values of the STRING type.
The minus sign (-) is used to obtain the negative of the value that is obtained by using a specific expression. Examples: -1 and -max(width).
Arithmetic operations: +, -, *, /
Example: width / 10
Relational operations: ==, !=, >, <, >=, <=
Logical operations: and, or, !
Examples: width >= 400 and height >= 300; !(a > 1 and b < 2)
Bitwise operations: &, |, ^
Example: 3 & (price ^ pubtime) + (price | pubtime)
Conditional operation: if(cond, thenValue, elseValue)
The thenValue is returned if the cond is non-zero, and the elseValue is returned if the cond is zero. For example, if(2, 3, 5) returns 3, and if(0, 3, 5) returns 5.
Note: The values used to perform conditional operations cannot be of a string type, such as the LITERAL or TEXT type.
i in [value1, value2, …, valuen]
The expression returns 1 if i is contained in the set [value1, value2, …, valuen]. Otherwise, 0 is returned. For example, 2 in [2, 4, 6] returns 1, and 3 in [2, 4, 6] returns 0.
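The semantics of the conditional operation and the in operation described above can be mirrored in a few lines of Java. This is only a sketch for checking the documented examples. The class and method names are hypothetical and are not part of OpenSearch.

```java
// Hypothetical helpers that mirror the documented semantics of if(...) and in [...].
final class ExpressionSemanticsSketch {

    // if(cond, thenValue, elseValue): thenValue when cond is non-zero, otherwise elseValue.
    static double ifExpr(double cond, double thenValue, double elseValue) {
        return cond != 0 ? thenValue : elseValue;
    }

    // i in [value1, ..., valuen]: 1 when i is contained in the set, otherwise 0.
    static int inExpr(long i, long... values) {
        for (long v : values) {
            if (v == i) {
                return 1;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(ifExpr(2, 3, 5));   // prints 3.0
        System.out.println(ifExpr(0, 3, 5));   // prints 5.0
        System.out.println(inExpr(2, 2, 4, 6)); // prints 1
        System.out.println(inExpr(3, 2, 4, 6)); // prints 0
    }
}
```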
max(a, b): Returns the larger of a and b.
min(a, b): Returns the smaller of a and b.
ln(a): Returns the natural logarithm of a.
log2(a): Returns the base-2 logarithm of a.
log10(a): Returns the base-10 logarithm of a.
sin(a): Returns the sine of a.
cos(a): Returns the cosine of a.
tan(a): Returns the tangent of a.
asin(a): Returns the arcsine of a.
acos(a): Returns the arccosine of a.
atan(a): Returns the arctangent of a.
ceil(a): Returns the smallest integer that is greater than or equal to a. For example, ceil(4.2) returns 5.
floor(a): Returns the greatest integer that is smaller than or equal to a. For example, floor(4.6) returns 4.
sqrt(a): Returns the square root of a. For example, sqrt(4) returns 2.
pow(a, b): Returns the result of a raised to the power of b. For example, pow(2, 3) returns 8.
now(): Returns the number of seconds that have elapsed since 00:00:00 January 1, 1970 in UTC.
random(): Returns a random value in [0,1].
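The worked examples in the function descriptions above can be checked against the standard java.lang.Math methods. This sketch assumes that the OpenSearch functions follow the usual mathematical definitions. It does not call OpenSearch itself, and the class name is hypothetical.

```java
// Reproduces the documented examples with java.lang.Math (results are doubles).
final class MathFunctionExamples {

    // Returns {ceil(4.2), floor(4.6), sqrt(4), pow(2, 3)}.
    static double[] documentedExamples() {
        return new double[] {
            Math.ceil(4.2),   // smallest integer >= 4.2
            Math.floor(4.6),  // greatest integer <= 4.6
            Math.sqrt(4),     // square root of 4
            Math.pow(2, 3)    // 2 raised to the power of 3
        };
    }

    public static void main(String[] args) {
        for (double value : documentedExamples()) {
            System.out.println(value); // prints 5.0, 4.0, 2.0, 8.0 on separate lines
        }
    }
}
```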
Built-in feature functions
OpenSearch provides a rich set of built-in feature functions, such as feature functions of the location-based service (LBS), text, and timeliness types. You can combine feature functions in sort expressions to perform complex relevance-based sorts.
Cava is an efficient programming language that the OpenSearch engine team developed based on LLVM. Its syntax is similar to that of Java, and it can achieve performance equivalent to that of C++. Cava is an object-oriented programming language that supports just-in-time (JIT) compilation and performs various security checks to make programs more robust. You can use the libraries that are provided by Cava and OpenSearch to design a dedicated sort plug-in in OpenSearch. Compared with the expressions provided by OpenSearch, a Cava-based sort plug-in has the following benefits:
More diversified custom designs: Cava allows you to customize a sort plug-in by using more diversified syntax. For example, you can use for loops and define functions and classes based on your business requirements.
Easier to maintain: A Cava-based sort plug-in is more readable and easier to maintain.
Easier to learn: The syntax of Cava is similar to that of Java. Users who are familiar with Java can understand and use Cava for development with ease. This reduces learning costs.
Note: Cava-based plug-ins can be used only in exclusive applications.
The following example shows how to configure rough sort and fine sort policies by using a text relevance-based sort function:
1. Create a rough sort policy: Log on to the OpenSearch console. In the left-side navigation pane, choose Search Algorithm Center > Sort Configuration. On the Policy Management page, click Create. On the Create Policy page, specify the policy name and set the Scope parameter to Rough Sort and the Type parameter to Expression. Then, click Next.
Select static_bm25 from the Scoring Characteristics drop-down list and set the Weight parameter to 10. A weight of 10 indicates that the score is multiplied by 10.
After the configuration is complete, the Policy Management page appears.
2. Create a fine sort policy: Log on to the OpenSearch console. In the left-side navigation pane, choose Search Algorithm Center > Sort Configuration. On the Policy Management page, click Create. On the Create Policy page, specify the policy name and set the Scope parameter to Fine Sort and the Type parameter to Expression. Then, click Next.
Select text_relevance from the Built-in Functions drop-down list, enter the field name to be queried in parentheses, and then click Completed.
After the configuration is complete, the Policy Management page appears.
3. View sort results: On the Search Test page, set the fields for rough sort and fine sort and turn on Show Sort Details.
The following figure shows the calculated score of each function.
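The effect of the Weight parameter in the rough sort step can be sketched as a simple multiplication. The class and method names are hypothetical, and the feature score is a made-up number. Real static_bm25 values come from the engine.

```java
// Hypothetical illustration: a weight of 10 multiplies the feature score by 10.
final class WeightedFeatureSketch {

    static double weightedScore(double featureScore, double weight) {
        return featureScore * weight;
    }

    public static void main(String[] args) {
        double staticBm25 = 0.25; // made-up feature score, for illustration only
        System.out.println(weightedScore(staticBm25, 10)); // prints 2.5
    }
}
```

A positive weight scales the scores but does not change the relative order that a single feature produces.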
SDK for Java:
// Use the default rough and fine sort expressions.
Rank rank = new Rank();
rank.setFirstRankName("default"); // The name of the rough sort policy.
rank.setSecondRankName("default"); // The name of the fine sort policy.
rank.setReRankSize(5); // The number of documents for fine sort.
SDK for PHP:
// Specify the rough sort expression.
$params->setFirstRankName('default');
// Specify the fine sort expression.
$params->setSecondRankName('default');
Note: The rough and fine sort expressions specified in the code prevail over the default rough and fine sort expressions configured in the OpenSearch console.
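The precedence rule in the note can be pictured as a simple fallback. The sketch below is not SDK code: the class and method names are hypothetical, and it assumes that "not specified in the code" is represented by null or an empty string.

```java
// Hypothetical illustration: a name set in code wins over the console default.
final class RankNameResolutionSketch {

    static String effectiveRankName(String specifiedInCode, String consoleDefault) {
        boolean specified = specifiedInCode != null && !specifiedInCode.isEmpty();
        return specified ? specifiedInCode : consoleDefault;
    }

    public static void main(String[] args) {
        // "my_rough_sort" and "console_rough_sort" are made-up policy names.
        System.out.println(effectiveRankName("my_rough_sort", "console_rough_sort")); // prints my_rough_sort
        System.out.println(effectiveRankName(null, "console_rough_sort"));            // prints console_rough_sort
    }
}
```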