How do I use OpenSearch to build a simple prototype for commodity searches in e-commerce scenarios?

This topic describes how to use OpenSearch to build a simple commodity search prototype for e-commerce scenarios. A common requirement of an e-commerce platform is to let users search commodity attributes by keyword and filter the retrieved commodities by category. OpenSearch can help you build a prototype that meets this requirement.

Preparations

When you create an Alibaba Cloud account and log on to the console for the first time, you are prompted to create an AccessKey pair before you can continue.

An AccessKey pair for your Alibaba Cloud account is required because OpenSearch applications are created and used based on the AccessKey pair.
After you create an AccessKey pair for your Alibaba Cloud account, you can also create an AccessKey pair for a RAM user so that you can access the application as the RAM user. For more information about how to grant permissions to RAM users, see Access authorization rules.

Architecture

The following figure shows the system architecture of the entire prototype.

Create an application

1. Log on to the OpenSearch console. In the left-side navigation pane, click Instance Management. On the Instance Management page, click Create Instance in the upper-left corner.

2. Select the application type.

E-commerce scenarios involve multiple related tables. Therefore, select an advanced application, which supports multi-table joins. For more information about application types, see Comparison between standard applications and advanced applications.

3. Set the application parameters.

Product Type: OpenSearch supports the subscription and pay-as-you-go billing methods. For more information, see Billing methods.
Region and Zone: OpenSearch supports the following regions and zones:

China: China (Shenzhen), China (Qingdao), China (Beijing), China (Zhangjiakou), China (Hangzhou), China (Shanghai), and China (Hong Kong)
Asia Pacific: Singapore
Europe & Americas: Germany (Frankfurt) and US (Virginia)

Application Name: The application name can contain digits, letters, and underscores (_). The name must start with a letter and can be up to 30 characters in length. You cannot modify the name after the application is created.
Application Type: You can select an advanced application or a standard application.
Cluster Preferences: OpenSearch supports the following types of specifications: shared general-purpose, shared computing, shared storage, exclusive general-purpose, exclusive computing, and exclusive storage. For more information, see What is OpenSearch?
Storage Capacity and Computing Resources: Specify the quotas for storage capacity and computing resources based on your needs. The number of logical computing units (LCUs) that you need is equal to the number of queries per second (QPS) multiplied by the number of LCUs that are consumed in each query. To measure the number of LCUs that are consumed in each query, you can purchase a shared general-purpose application instance and perform a search test.
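The LCU formula above can be turned into a quick sizing calculation. The following is a minimal sketch, not part of any OpenSearch SDK: the class name, the method name, and the choice to round up to a whole LCU are assumptions made for illustration, and the input numbers are hypothetical.

```java
/** Rough LCU quota estimate: required LCUs = peak QPS x LCUs consumed per query. */
final class LcuEstimator {

    /**
     * @param peakQps      expected peak queries per second (your own estimate)
     * @param lcusPerQuery LCUs consumed by one query, as observed in a search test
     * @return LCUs to provision, rounded up to a whole unit (rounding is an assumption)
     */
    static long requiredLcus(double peakQps, double lcusPerQuery) {
        if (peakQps < 0 || lcusPerQuery < 0) {
            throw new IllegalArgumentException("inputs must be non-negative");
        }
        return (long) Math.ceil(peakQps * lcusPerQuery);
    }

    public static void main(String[] args) {
        // Hypothetical numbers: 200 QPS at peak, 0.5 LCU per query.
        System.out.println(requiredLcus(200, 0.5)); // prints 100
    }
}
```

Measure the per-query LCU consumption with queries that resemble your production traffic, because the figure depends on the query itself.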

4. Configure the application.

Manually define an application schema: You can customize the application schema to create an application.
Use a template to define an application schema: OpenSearch provides multiple commonly used templates. You can also create a template based on your custom application schema and then use the template to create an application with ease.
Upload a file to define an application schema: You can upload a data file to the OpenSearch console. OpenSearch then parses the uploaded data file and generates an initial application schema. The data file must be in the JSON format. After the initial application schema is generated, you must redefine specific attributes such as field types.
Use a data source to define an application schema: You can use this method if you want to synchronize data from data sources such as ApsaraDB RDS, MaxCompute, and PolarDB. The schema of a source table can be used to generate an initial application schema. This reduces the manual definition workload and the chance of errors. The steps for connecting to different data sources are similar. The following figures show how to connect to an ApsaraDB RDS data source. For more information, see Configure an ApsaraDB RDS for MySQL data source.

If you use Alibaba Cloud storage services such as MaxCompute, ApsaraDB RDS, or PolarDB, you can specify them as data sources in the OpenSearch console so that data is automatically synchronized to OpenSearch. The following example shows you how to use an ApsaraDB RDS data source to create an application schema.

5. Connect to a data source.

Enter the database information, as shown in the following figure.

6. Select a data source.

7. Define the application schema.

In this example, the application schema is created based on a commodity table and a commodity price table. The commodity table is used as the primary table, and the commodity price table is used as a secondary table. The primary key ID of the commodity price table is associated with the foreign key ID of the commodity table. The following figures show the primary and secondary tables that are used to create the application schema of the prototype.

8. Define the index schema.

Add the fields that are used for searches in the commodity table and commodity price table to the index list named "default". This way, you can use query=default:"keyword" to search for commodities. The following figure shows the details of the index schema.

Note: The analysis method affects the search results. Proceed with caution when you select the analysis method. For more information, see Built-in analyzers.

9. Configure a data source.

In this step, you can specify whether to enable automatic data synchronization. After you enable automatic data synchronization, data updates in the data source are automatically synchronized to OpenSearch.

10. After the configuration is complete, click Completed. The Application Details page shows that the application is being initialized.

Upload data

In the preceding example, an ApsaraDB RDS data source is used. In this case, a full data import starts by default when the indexes are created. You can view the data import progress on the Application Details page. Alternatively, you can use OpenSearch APIs or SDKs to manually upload data.
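For a manual upload, each document is typically sent as a command that wraps the field values. The following sketch only builds such a payload as a string and does not call any OpenSearch SDK. The `cmd`/`fields` layout, the class and method names, and the sample field names are assumptions for illustration, so check them against the API and SDK documentation before use.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Builds a JSON "add" command for one document (payload layout is an assumption). */
final class PushPayloadSketch {

    /** Quotes a JSON string; only backslashes and double quotes are escaped in this sketch. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Serializes the fields in insertion order and wraps them in an "add" command. */
    static String addCommand(Map<String, String> fields) {
        StringJoiner body = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            body.add(quote(e.getKey()) + ":" + quote(e.getValue()));
        }
        return "{\"cmd\":\"add\",\"fields\":" + body + "}";
    }

    public static void main(String[] args) {
        // Hypothetical commodity record; field names must match your application schema.
        Map<String, String> commodity = new LinkedHashMap<>();
        commodity.put("id", "1001");
        commodity.put("title", "Slim dress");
        System.out.println("[" + addCommand(commodity) + "]");
    }
}
```

In a real project, a JSON library and the official SDK are a better fit than hand-built strings. The sketch only shows the shape of the data.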

Test

After data is uploaded, you can start a search. You can perform searches by using APIs or SDKs, or on the search test page that is built into the OpenSearch console. For more information about APIs and SDKs, see API overview and SDK overview. For more information about the search syntax, see Initiate search requests and query clause. The following figures show how to perform a search on the search test page and the search results.
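When you search through the API instead of the search test page, the request carries a query string made of clauses. The following sketch assembles a query clause against the "default" index defined earlier, followed by a config clause for paging. The `&&` separator and the `config=start:…,hit:…,format:json` form are assumptions based on the documented search syntax, and the class and method names are hypothetical.

```java
/** Assembles a search query string from a keyword and paging settings (illustrative only). */
final class QueryStringSketch {

    /**
     * @param keyword text to search in the "default" index
     * @param start   offset of the first result to return
     * @param hit     number of results to return
     */
    static String build(String keyword, int start, int hit) {
        if (keyword == null || keyword.isEmpty()) {
            throw new IllegalArgumentException("keyword must not be empty");
        }
        // Quotes inside the keyword would break the clause; this sketch rejects them
        // instead of guessing at an escaping rule.
        if (keyword.indexOf('"') >= 0 || keyword.indexOf('\'') >= 0) {
            throw new IllegalArgumentException("keyword must not contain quotes");
        }
        if (start < 0 || hit <= 0) {
            throw new IllegalArgumentException("start must be >= 0 and hit must be > 0");
        }
        return "query=default:\"" + keyword + "\""
                + "&&config=start:" + start + ",hit:" + hit + ",format:json";
    }

    public static void main(String[] args) {
        System.out.println(build("dress", 0, 10));
    }
}
```

The resulting string still needs to be URL-encoded and signed according to the API documentation before it is sent.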

You can also use the custom features provided by OpenSearch to obtain a better search experience. A long-tail search query may lead to few retrieval results. A search query that contains spelling errors or Chinese Pinyin may lead to no retrieval results. In such cases, you can use the custom features to resolve the issues. For more information, see Query analysis and Perform searches based on relevance.

Configure a query analysis rule: The following example describes how to configure a query analysis rule by using the spelling correction feature.

Step 1: Create an intervention dictionary for query analysis.

Step 1.1: Log on to the OpenSearch console. In the left-side navigation pane, choose Search Algorithm Center > Retrieval Configuration. On the Retrieval Configuration page, click Dictionary Management in the left-side pane to go to the Dictionary Management page.

Step 1.2: Click Create in the upper-right corner. In the Create Query Analysis Dictionary panel, specify the dictionary name, set the Dictionary Type parameter to Spelling Correction, and then click Save.

Step 1.3: In the dictionary list of the Dictionary Management page, find the dictionary that you created and click Manage Entries in the Actions column to go to the Manage Entries page.

Step 1.4: Click Add Intervention Entry to create an intervention entry.

Step 1.5: Click Save. The intervention entry is added. You can view the intervention entry that you created in the intervention entry list.

Step 2: In the left-side navigation pane of the OpenSearch console, choose Search Algorithm Center > Retrieval Configuration. On the Retrieval Configuration page, click Query Analysis Rule Configuration in the left-side pane to go to the Query Analysis Rule Configuration page.

Step 3: Click Create in the upper-right corner to add a rule that is not published. In the Create Rule panel, set the Intervention Dictionaries parameter to the dictionary that you created, which is named dic_error in this example. The rule can use the following query analysis features:

Stop Word: This feature filters out meaningless words in search queries based on the built-in stop word dictionary. Meaningless words are the words that appear at a high frequency but do not affect the search results, such as punctuation and modal particles. For example, the search query is "Running Man!". After stop word filtering, the exclamation point (!) is filtered out and is not involved in the retrieval process.
Spelling Correction: This feature corrects the spelling errors that are contained in a search query and provides correction suggestions. If the original search query contains definite spelling errors, OpenSearch corrects these errors and retrieves documents based on the corrected search query. If the original search query contains possible spelling errors, OpenSearch retrieves documents based on the original search query. For example, OpenSearch corrects the spelling error in the search query "Alipapa" and uses the corrected search query "Alibaba" to retrieve documents.
Word Weight: This feature, also called term weight analysis, evaluates the importance of each term in search queries and quantifies the evaluated importance as a weight. OpenSearch may not use low-importance terms to retrieve documents. For example, the search query is "OpenSearch is good or not". After term weight analysis, the documents that contain "OpenSearch" can be retrieved.
Synonym: This feature adds synonyms for terms in search queries based on the common synonym library and semantic models that are provided by OpenSearch. This increases the number of retrieval results. For example, the search query is "KFC". After synonym expansion, the documents that contain "Kentucky Fried Chicken" or "KFC" are retrieved. This feature can be combined with the term weight analysis feature to achieve better performance.
Entity Recognition: The named entity recognition (NER) feature of OpenSearch recognizes the semantic entities in a search query after the search query is analyzed. Each semantic entity is assigned to a specific category. The semantic entity categories with low priorities may be ignored in the search process, whereas the semantic entity categories with high priorities may affect the training of the category prediction model. For example, the search query is "Nike Slim Dress". After NER, "Nike" is recognized as a brand name with the medium priority, "Slim" as a style element with the low priority, and "Dress" as a category name with the high priority.
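To make the role of a spelling correction intervention dictionary concrete, the following conceptual sketch treats the dictionary as a lookup table from a misspelled term to its correction and applies it term by term. This is only an illustration of the idea and not how OpenSearch implements query analysis. The class name, the whitespace tokenization, and the case-insensitive matching are all assumptions.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;

/** Conceptual model of an intervention dictionary: misspelled term -> corrected term. */
final class SpellingInterventionSketch {

    /**
     * Replaces every whitespace-separated term that has a dictionary entry.
     * Dictionary keys are expected in lower case; terms without an entry are kept as-is.
     */
    static String correct(String query, Map<String, String> dictionary) {
        return Arrays.stream(query.trim().split("\\s+"))
                .map(term -> dictionary.getOrDefault(term.toLowerCase(Locale.ROOT), term))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // Mirrors the example in the text: "Alipapa" is corrected to "Alibaba".
        Map<String, String> dictionary = Collections.singletonMap("alipapa", "Alibaba");
        System.out.println(correct("Alipapa cloud", dictionary)); // prints Alibaba cloud
    }
}
```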

Step 4: After the rule is created, click Search Test in the Actions column of the Query Analysis Rule Configuration page to verify the search results.

Step 5: After you confirm that the process of query analysis is correct, click Index Orientation on the Query Analysis Rule Configuration page. Then, set the created query analysis rule as the default query analysis rule.

Configure sort expressions: Sort expressions allow you to use custom methods to sort search results for an application. You can specify expressions in the query clause to sort results. For more information, see Configure sort expressions.

Step 1: Log on to the OpenSearch console. In the left-side navigation pane, choose Search Algorithm Center > Sort Configuration to go to the Policy Management page.

Step 2: Click Create in the upper-right corner to add a rough sort expression.

A rough sort greatly affects the search performance. Therefore, we recommend that you select representative fields in the Sort Configuration step. The preceding figures show how to configure expressions for calculating the text score and the timeliness score that indicates how new a document is.

Step 3: Add a fine sort expression.

The preceding figures show how to configure an expression for calculating the text relevance score.

Step 4: Complete the configurations. The following figure shows the results of a search test. On the Search Test page, you can compare the search results of a common query and a query that uses a sort expression.
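The interplay of rough sort and fine sort can be pictured as a two-stage ranking: a cheap score selects a limited set of candidates, and a more expensive score re-ranks only that set. The following sketch illustrates this idea with precomputed scores. It does not use OpenSearch expressions or APIs, and the class, field, and parameter names are hypothetical.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

/** Two-stage ranking: rough sort picks candidates, fine sort orders them. */
final class TwoStageSortSketch {

    /** Hypothetical document; the scores stand in for the configured sort expressions. */
    static final class Doc {
        final String id;
        final double roughScore; // cheap score, for example text score plus timeliness score
        final double fineScore;  // more expensive score, for example text relevance score

        Doc(String id, double roughScore, double fineScore) {
            this.id = id;
            this.roughScore = roughScore;
            this.fineScore = fineScore;
        }
    }

    /** Keeps the roughKeep best documents by rough score, then orders them by fine score. */
    static List<String> rank(List<Doc> docs, int roughKeep) {
        return docs.stream()
                .sorted(Comparator.comparingDouble((Doc d) -> d.roughScore).reversed())
                .limit(roughKeep)
                .sorted(Comparator.comparingDouble((Doc d) -> d.fineScore).reversed())
                .map(d -> d.id)
                .collect(Collectors.toList());
    }
}
```

A document that the rough stage drops never reaches the fine stage, however high its fine score would be. This is one reason to choose representative fields for the rough sort expression.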

Drop-down suggestions: OpenSearch provides the drop-down suggestion feature, which suggests candidate queries as a user types. This reduces the effort of entering search queries in e-commerce scenarios. For more information, see Drop-down suggestions.

Category prediction: OpenSearch provides the category prediction feature to predict the category to which a search query belongs. For more information, see Category prediction.

Other common configurations:

In e-commerce scenarios, multiple commodities of the same vendor may score highly and occupy the top of the search result list. This reduces the variety of the displayed results and affects the user experience. To resolve this issue, you can use distinct clauses to diversify the search results. For more information, see Distinct clauses.
For business scenarios in which you want to view search results based on price ranges, filter clauses can be used. For more information, see Filter clauses. The following sample code shows how to use filter clauses based on the price field:
```
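// Add a lower bound on the price field when the caller supplied one.
if (lowPrice != null && !lowPrice.isEmpty()) {
  queryElement.addFilter("price>=" + lowPrice);
}
// Add an upper bound on the price field when the caller supplied one.
if (highPrice != null && !highPrice.isEmpty()) {
  queryElement.addFilter("price<=" + highPrice);
}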
```
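The two configurations above can also be sketched without an SDK object, which makes them easier to test. In the following sketch, the price bounds are validated as numbers before they are placed in a filter expression, and a distinct clause is assembled for a vendor field. The `dist_key`, `dist_count`, and `dist_times` parameter names are assumptions to verify against Distinct clauses, and `vendor_id` and all class and method names are hypothetical.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

/** Builds price filter expressions and a distinct clause as plain strings (illustrative only). */
final class FilterAndDistinctSketch {

    /** Returns expressions such as "price>=10"; null or blank bounds are skipped. */
    static List<String> priceFilters(String lowPrice, String highPrice) {
        List<String> filters = new ArrayList<>();
        if (lowPrice != null && !lowPrice.trim().isEmpty()) {
            // BigDecimal throws NumberFormatException for non-numeric input,
            // so arbitrary user text never reaches the filter expression.
            filters.add("price>=" + new BigDecimal(lowPrice.trim()).toPlainString());
        }
        if (highPrice != null && !highPrice.trim().isEmpty()) {
            filters.add("price<=" + new BigDecimal(highPrice.trim()).toPlainString());
        }
        return filters;
    }

    /**
     * @param field    field to diversify on, for example a vendor ID field
     * @param perRound documents to keep per distinct value in each round
     * @param rounds   number of extraction rounds
     */
    static String distinctClause(String field, int perRound, int rounds) {
        return "distinct=dist_key:" + field
                + ",dist_count:" + perRound
                + ",dist_times:" + rounds;
    }

    public static void main(String[] args) {
        System.out.println(priceFilters("10", "99.50"));
        System.out.println(distinctClause("vendor_id", 1, 1));
    }
}
```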

Conclusion

After you complete the preceding steps, you have a simple commodity search prototype for e-commerce scenarios that is built on OpenSearch. OpenSearch provides search services and APIs that let you search data based on your business requirements, which reduces the development workload for search features. Because you do not need to build and operate a search engine platform yourself, the workload and cost of system deployment and maintenance are also reduced. You can further use the custom configurations and features of OpenSearch to improve the search experience in your business scenarios.