OpenSearch: Intervention dictionaries for stop words

Last Updated: Mar 06, 2023

Overview

OpenSearch has a built-in stop word dictionary that filters stop words out of search queries. To intervene in stop word filtering, perform the following steps:

  1. Create an intervention dictionary for stop words. To create an intervention dictionary, log on to the OpenSearch console. In the left-side navigation pane, choose Search Algorithm Center > Retrieval Configuration. On the Basic Configuration page, click Dictionary Management in the left-side pane. On the Dictionary Management page, click Create in the upper-left corner. Specify a name for the intervention dictionary, select a dictionary type, and then click Save. After the intervention dictionary is created, it appears in the dictionary list.

  2. Create and manage intervention entries in the intervention dictionary. To open the entry management page of the dictionary, find the dictionary in the dictionary list and click Manage Entries in the Actions column. On this page, create and manage intervention entries as needed. You can intervene in stop word filtering in the following ways:

     • Add a stop word. If a term in the segmented search query matches the added stop word, the term is not used for retrieval.

     • Block a stop word. If a term in the segmented search query matches the blocked stop word, the term is used for retrieval.

  3. Use the intervention dictionary. After you create intervention entries in the intervention dictionary, you can use the intervention dictionary in a query analysis rule on an application as needed.

  4. Test and publish the intervention dictionary. After the intervention dictionary is associated with the query analysis rule, we recommend that you perform a search test before you apply the rule to online environments. The test helps you confirm that the search results meet your expectations.
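
The effect of the two intervention types on a segmented query can be sketched in Python as follows. This is only an illustration, not how OpenSearch is implemented: the function name `filter_stop_words`, the sample contents of the built-in dictionary, and the rule that a blocked word takes precedence over the built-in dictionary are assumptions made for the example.

```python
def filter_stop_words(terms, builtin_stop_words, added, blocked):
    """Return the terms that would still be used for retrieval.

    terms: the search query after segmentation (a list of terms).
    builtin_stop_words: stand-in for the built-in dictionary (hypothetical contents).
    added: stop words from "Add" intervention entries.
    blocked: stop words from "Block" intervention entries.
    """
    # Added entries extend the stop word set; blocked entries are removed from it.
    effective = (set(builtin_stop_words) | set(added)) - set(blocked)
    return [t for t in terms if t not in effective]


builtin = {"the", "a", "of"}  # hypothetical sample, not the real built-in dictionary

# "has" is added as a stop word, so it is not used for retrieval.
print(filter_stop_words(["hainan", "has", "bananas"], builtin, added={"has"}, blocked=set()))

# "the" is blocked, so it is used for retrieval even though the sample dictionary lists it.
print(filter_stop_words(["the", "who"], builtin, added=set(), blocked={"the"}))
```

In this sketch, the first call keeps only "hainan" and "bananas", and the second call keeps both "the" and "who".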

Example

Scenario: You have created query analysis rules with stop word filtering enabled for the OpenSearch application of your e-commerce shopping guide service. After you apply these rules to the online application, the returned search results are unsatisfactory. To resolve the issue, you intervene in stop word filtering.

Unsatisfactory search results: A user enters the search query "Hainan has bananas". Only a few documents that contain the "Hainan has bananas" phrase are retrieved.

Problem description: One reason is that the system fails to recognize the word "has" in the search query as a stop word, so "has" is still used as a term for retrieval.

Solution: Create an intervention dictionary and add the word "has" as a stop word. Then, associate the intervention dictionary with a query analysis rule that is used for the online application.

Procedure:

1. Log on to the OpenSearch console. In the left-side navigation pane, choose Search Algorithm Center > Retrieval Configuration. On the Basic Configuration page, click Dictionary Management in the left-side pane. On the Dictionary Management page, click Create in the upper-left corner to create an intervention dictionary.

In the Create Query Analysis Dictionary panel, specify a name for the intervention dictionary and set the Dictionary Type parameter to Stop Word.

2. Find the created intervention dictionary and click Manage Entries in the Actions column. On the page that appears, click Add Intervention Entry. In the Add Intervention Entries panel, enter "has" in the Stop Word column, select Add in the Intervention Type column, and then click Save.

3. Go to the Query Analysis Rule Management page and click Create in the upper-left corner. In the Create Rule panel, associate the created stop word intervention dictionary with the rule. Do not apply the rule to the online application in this step.

4. Perform a search test. When you search for "Hainan has bananas", all documents that contain the phrase "Hainan bananas" are also retrieved.
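
To see why filtering out "has" widens the result set, consider the following toy sketch in Python. It assumes that a document matches only if it contains every remaining query term. That matching rule, the sample documents, and the function name `matching_docs` are assumptions for illustration only; this page does not describe the actual retrieval logic of OpenSearch.

```python
def matching_docs(query_terms, docs):
    """Return the documents that contain every query term (toy AND matching)."""
    return [d for d in docs if all(t in d.lower().split() for t in query_terms)]


# Hypothetical sample documents.
docs = [
    "Hainan has bananas",
    "Hainan bananas on sale",
    "fresh bananas from Hainan",
]

before = matching_docs(["hainan", "has", "bananas"], docs)  # "has" is still used for retrieval
after = matching_docs(["hainan", "bananas"], docs)          # "has" is filtered out as a stop word

print(len(before), len(after))
```

With these sample documents, the query that still includes "has" matches one document, and the query without it matches all three.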

Usage notes

  • You cannot change the name or type of an intervention dictionary after it is created.

  • You must specify different stop words for different intervention entries.

  • An intervention dictionary can be used by multiple query analysis rules.

  • OpenSearch uses the built-in dictionaries together with the intervention entries that you create. If you enable stop word filtering when you create a query analysis rule, the built-in dictionary for stop words is automatically selected.

  • You cannot delete an intervention dictionary that is used by a query analysis rule, regardless of whether the rule is applied to an online application or an offline application. You must first disassociate the dictionary from the rule.

Limits

  • You can create a maximum of 20 intervention dictionaries for stop words within your Alibaba Cloud account.

  • You can specify only one stop word for an intervention entry.

  • You can create a maximum of 500 intervention entries in an intervention dictionary.

  • An intervention entry takes effect only when a term in a segmented search query matches the stop word in the entry. For example, if you specify "what" as a stop word in an intervention entry and the search query is "what facial cream is good", OpenSearch retrieves documents based on the search query "facial cream is good".

  • OpenSearch normalizes the content of intervention entries. All uppercase letters are converted into lowercase letters and all full-width characters are converted into half-width characters.
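
The normalization described in the last item can be sketched in Python as follows. This is an approximation under stated assumptions, not the OpenSearch implementation: the function name `normalize_entry` is hypothetical, and the sketch assumes that "full-width characters" means the Unicode full-width ASCII variants (U+FF01 to U+FF5E) and the ideographic space (U+3000). The exact set of characters that OpenSearch converts is not specified on this page.

```python
def normalize_entry(text):
    """Lowercase the text and map full-width characters to half-width ones."""
    out = []
    for ch in text:
        code = ord(ch)
        if code == 0x3000:               # ideographic (full-width) space
            code = 0x20
        elif 0xFF01 <= code <= 0xFF5E:   # full-width ASCII variants
            code -= 0xFEE0               # fixed offset to the half-width code point
        out.append(chr(code))
    return "".join(out).lower()


# A full-width uppercase entry and a mixed-case entry normalize to the same text.
print(normalize_entry("\uff37\uff28\uff21\uff34"))  # full-width "WHAT"
print(normalize_entry("What"))
```

Under these assumptions, both calls produce "what", so entries that differ only in letter case or character width are treated as the same stop word.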