OpenSearch: Stop word filtering

Last Updated: Mar 09, 2023

Overview

Stop word filtering removes meaningless words from search queries based on the built-in stop word dictionary. Meaningless words are words that appear at a high frequency but do not affect the search results, such as punctuation marks and modal particles. For example, if the search query is Running!Man, the exclamation point (!) is filtered out during data retrieval. Similarly, if the search query is Did you eat, the word Did is filtered out during data retrieval because it carries no search intent.
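The filtering behavior described above can be illustrated with a minimal Python sketch. This is not how OpenSearch implements the feature: the dictionary contents (BUILT_IN_STOP_WORDS), the tokenizer, and the function names are assumptions made only to reproduce the two examples.

```python
import re

# Hypothetical stand-in for the built-in stop word dictionary.
# The real dictionary contents are not listed in this topic.
BUILT_IN_STOP_WORDS = {"!", "?", ",", ".", "did"}


def tokenize(query):
    # Split the query into runs of word characters and single punctuation marks.
    return re.findall(r"\w+|[^\w\s]", query)


def filter_stop_words(query, stop_words=BUILT_IN_STOP_WORDS):
    # Keep only the tokens whose lowercase form is not in the dictionary.
    return [token for token in tokenize(query) if token.lower() not in stop_words]


print(filter_stop_words("Running!Man"))  # ['Running', 'Man']
print(filter_stop_words("Did you eat"))  # ['you', 'eat']
```

The remaining tokens are what would be used for retrieval in this simplified model.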

Procedure

1. Log on to the OpenSearch console. In the left-side navigation pane, click Retrieval Configuration. On the Basic Configuration page, click Query Analysis Rule Configuration in the left-side pane. On the Query Analysis Rule Configuration page, select an application and the online or offline version of the application, and click Create.

2. In the Create Rule panel, configure the Rule Name, Index Range, and Industry Type parameters, select Stop Word, and then click OK.

Note: If no intervention dictionary for stop word filtering is specified, stop words are filtered out based on the built-in stop word dictionary. If the built-in stop word dictionary filters out words that are not valid stop words for your business, or fails to identify words that you want filtered out, specify an intervention dictionary. For more information, see the "Intervention dictionaries for stop word filtering" section of this topic.

3. After the rule is created, click Search Test to perform a search test.

Check the test result, and then view the process of query analysis.

4. After you confirm that the process of query analysis is correct, click Index Orientation on the Query Analysis Rule Configuration page. Then, set the created query analysis rule as the default query analysis rule.

5. Check the default query analysis rule.

Intervention dictionaries for stop word filtering

Stop words vary with business scenarios. The built-in stop word dictionary may lack stop words that your business requires, or it may contain words that are not valid stop words for your business. To handle both cases, OpenSearch allows you to customize stop words. After you create an intervention dictionary for stop word filtering, you can specify the intervention dictionary when you create or modify a query analysis rule to intervene in stop word filtering. For more information, see Intervention dictionaries for stop word filtering.
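The effect of an intervention dictionary can be pictured with a small Python sketch. It assumes that an intervention can both add stop words and exempt words that the built-in dictionary would filter out. The function names, the add and remove parameters, and the sample words are hypothetical and do not reflect the actual dictionary format or API of OpenSearch.

```python
def effective_stop_words(built_in, add=(), remove=()):
    # Combine the built-in dictionary with intervention entries (assumed model):
    # words in `remove` are no longer treated as stop words,
    # words in `add` are treated as stop words. `add` wins on conflict.
    added = {word.lower() for word in add}
    removed = {word.lower() for word in remove}
    return ({word.lower() for word in built_in} - removed) | added


def filter_tokens(tokens, stop_words):
    # Drop every token whose lowercase form is a stop word.
    return [token for token in tokens if token.lower() not in stop_words]


# Hypothetical built-in dictionary and intervention entries.
built_in = {"the", "a", "did"}
custom = effective_stop_words(built_in, add={"cheap"}, remove={"the"})

# "The" is kept because the intervention exempts it; "cheap" is filtered out.
print(filter_tokens(["The", "cheap", "Who", "album"], custom))
# ['The', 'Who', 'album']
```

In this model, a business that sells music could keep "The" in queries such as a band name while filtering out a word like "cheap" that does not help retrieval.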