Implement an E-Commerce Search System Using OpenSearch

Build a keyword search system for product catalogs that supports multi-attribute search, category filtering, query analysis, and relevance ranking. After completing this tutorial, you will have a working search prototype that accepts keyword queries, filters results by category and price, corrects misspelled queries, and ranks results by relevance.

System architecture of the e-commerce search prototype

Prerequisites

Before you begin, make sure you have:

An Alibaba Cloud account with an AccessKey pair. The OpenSearch application requires an AccessKey pair for authentication.
(Optional) An AccessKey pair for a RAM user, if you plan to delegate access. For details, see Access authorization rules.
(Optional) An ApsaraDB RDS for MySQL instance with product data, if you plan to use a database as your data source.

Step 1: Create an application

E-commerce search involves multiple related tables, such as a product table and a pricing table, so create an advanced application, which supports multi-table joins.

Log on to the OpenSearch console. In the left-side navigation pane, click Instance Management. On the Instance Management page, click Create Instance.
Select the application type. Choose an advanced application that supports joining of multiple tables. For a comparison of application types, see Comparison between standard applications and advanced applications.

Configure the application parameters. The following table describes each parameter.

Parameter	Description
Product Type	Billing method: subscription or pay-as-you-go. For details, see Billing methods.
Region and Zone	China: Shenzhen, Qingdao, Beijing, Zhangjiakou, Hangzhou, Shanghai, and Hong Kong. Asia Pacific: Singapore. Europe & Americas: Germany (Frankfurt) and US (Virginia).
Application Name	Accepts digits, letters, and underscores (_). Must start with a letter. Maximum 30 characters. Cannot be changed after creation.
Application Type	Advanced application or standard application.
Cluster Preferences	Specification types: shared general-purpose, shared computing, shared storage, exclusive general-purpose, exclusive computing, and exclusive storage. For details, see What is OpenSearch?
Storage Capacity and Computing Resources	Set quotas based on your workload. The number of logical computing units (LCUs) equals the number of queries per second (QPS) multiplied by the LCUs consumed per query. To view per-query LCU consumption, purchase a shared general-purpose instance and run a search test.
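Two rules in the table above are easy to check before you open the console: the application name format and the LCU estimate. The following Java sketch is one way to express them. The class and method names are hypothetical, "letters" is assumed to mean ASCII letters, and rounding the LCU estimate up to a whole unit is an assumption rather than documented behavior.

```java
import java.util.regex.Pattern;

final class ApplicationPlanning {
    // Starts with a letter, then letters, digits, or underscores; 30 characters at most.
    // Assumption: "letters" means ASCII letters.
    private static final Pattern NAME = Pattern.compile("[A-Za-z][A-Za-z0-9_]{0,29}");

    static boolean isValidApplicationName(String name) {
        return name != null && NAME.matcher(name).matches();
    }

    // LCUs = QPS multiplied by the LCUs consumed per query.
    // Assumption: the result is rounded up to a whole unit.
    static long requiredLcus(double peakQps, double lcusPerQuery) {
        if (peakQps < 0 || lcusPerQuery < 0) {
            throw new IllegalArgumentException("values must not be negative");
        }
        return (long) Math.ceil(peakQps * lcusPerQuery);
    }

    public static void main(String[] args) {
        System.out.println(isValidApplicationName("shop_search_v1")); // true
        System.out.println(isValidApplicationName("1shop"));          // false: starts with a digit
        System.out.println(requiredLcus(200, 0.5));                   // 100
    }
}
```

The per-query LCU figure (0.5 here) is a made-up value; measure the real one with a search test as described in the table.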

Step 2: Define the application schema

OpenSearch offers four ways to define your application schema:

Method	Best for
Manually define an application schema	Full control over every field and type
Use a template to define an application schema	Quick setup with a predefined or custom template
Upload a file to define an application schema	Bootstrapping from an existing JSON data file. OpenSearch parses the file and generates an initial schema. Redefine field types after generation.
Use a data source to define an application schema	Syncing data from ApsaraDB RDS, MaxCompute, or PolarDB. The source table schema generates the initial application schema, reducing manual work and errors.

If you use Alibaba Cloud storage services such as MaxCompute, ApsaraDB RDS, or PolarDB, specify them as data sources in the OpenSearch console for automatic data synchronization. The following steps use an ApsaraDB RDS data source as an example. For more information, see Configure an ApsaraDB RDS for MySQL data source.

Connect to the data source

Enter the database connection details.

Select the data source

Choose the tables to import.

Configure the table schema

This example uses two tables:

Primary table: the commodity table (product catalog)
Secondary table: the commodity price table

The primary key ID of the commodity price table is associated with the foreign key ID of the commodity table.

Primary and secondary table schema configuration
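To make the table association concrete, the following Java sketch joins the two tables in memory the way the schema describes: each commodity row carries a foreign key that points at the primary key of a price row. All class and field names (Commodity, priceId, and so on) are hypothetical, and this illustrates the relationship only, not how OpenSearch performs the join internally. How OpenSearch treats a commodity without a matching price row is not covered here; the sketch simply skips it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CommodityJoin {
    static final class Commodity {
        final long id; final String title; final long priceId; // priceId: foreign key
        Commodity(long id, String title, long priceId) {
            this.id = id; this.title = title; this.priceId = priceId;
        }
    }

    static final class CommodityPrice {
        final long id; final double price; // id: primary key
        CommodityPrice(long id, double price) { this.id = id; this.price = price; }
    }

    static final class SearchDocument {
        final long commodityId; final String title; final double price;
        SearchDocument(long commodityId, String title, double price) {
            this.commodityId = commodityId; this.title = title; this.price = price;
        }
    }

    // Builds one flat document per commodity that has a matching price row.
    static List<SearchDocument> join(List<Commodity> commodities, List<CommodityPrice> prices) {
        Map<Long, CommodityPrice> priceById = new HashMap<>();
        for (CommodityPrice p : prices) {
            priceById.put(p.id, p);
        }
        List<SearchDocument> docs = new ArrayList<>();
        for (Commodity c : commodities) {
            CommodityPrice p = priceById.get(c.priceId);
            if (p != null) {
                docs.add(new SearchDocument(c.id, c.title, p.price));
            }
        }
        return docs;
    }

    public static void main(String[] args) {
        List<SearchDocument> docs = join(
            List.of(new Commodity(1L, "Slim dress", 10L)),
            List.of(new CommodityPrice(10L, 59.9)));
        System.out.println(docs.get(0).title + " costs " + docs.get(0).price);
    }
}
```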

Define the index schema

Add all searchable fields from the commodity table and commodity price table to an index list named "default". This enables queries such as query=default:"keyword".

Note

The analyzer affects search results. Choose the analysis method carefully. For details, see Built-in analyzers.
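As a small illustration of the query form shown above, the following Java sketch builds a query clause string for a given index and keyword. The class name is hypothetical, and removing double quotes from the keyword is a conservative assumption, not a documented escaping rule; check the search syntax reference before relying on it.

```java
final class QueryClause {
    // Returns a clause such as: query=default:"dress"
    static String forIndex(String indexName, String keyword) {
        if (indexName == null || indexName.isEmpty()) {
            throw new IllegalArgumentException("index name must not be empty");
        }
        // Assumption: drop double quotes so the keyword cannot close the quoted term early.
        String cleaned = keyword == null ? "" : keyword.replace("\"", "").trim();
        if (cleaned.isEmpty()) {
            throw new IllegalArgumentException("keyword must not be empty");
        }
        return "query=" + indexName + ":\"" + cleaned + "\"";
    }

    public static void main(String[] args) {
        System.out.println(forIndex("default", "dress")); // query=default:"dress"
    }
}
```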

Enable data synchronization

Configure automatic data synchronization so that updates in the data source are automatically pushed to OpenSearch.

Finalize the application

Click Completed. On the Application Details page, the application status shows that initialization is in progress.

Step 3: Upload data

When you use an ApsaraDB RDS data source, full data import starts automatically during index creation. Monitor the import progress on the Application Details page.

Alternatively, upload data through OpenSearch APIs or SDKs. For details, see API overview and SDK overview.

Step 4: Test search queries

After data upload completes, run a search test. The OpenSearch console provides a built-in search test page. For programmatic access, use the OpenSearch APIs or SDKs.

For search syntax details, see Initiate search requests and query clause.

Step 5: Configure query analysis

Query analysis processes queries before retrieval to improve search quality. Long-tail queries may return few results, and queries with spelling errors or Chinese Pinyin may return no results. Query analysis addresses these issues. For more information, see Query analysis and Perform searches based on relevance.

OpenSearch provides the following query analysis features:

Feature	What it does	Example
Stop word	Filters out tokens that carry no search meaning, such as punctuation and modal particles.	Query "Running Man!" -- the exclamation mark is filtered out.
Spelling correction	Automatically corrects unambiguous spelling errors. When a correction is uncertain, the original query is used.	"Alipapa" is corrected to "Alibaba".
Word weight	Evaluates term importance and assigns weights. Low-importance terms may be excluded from retrieval.	Query "OpenSearch is good or not" -- retrieves documents containing "OpenSearch".
Synonym	Expands queries with synonyms from the built-in synonym library and semantic models. Can be combined with word weight for better results.	"KFC" also retrieves "Kentucky Fried Chicken".
Named entity recognition (NER)	Identifies semantic entities in queries and assigns category priorities.	"Nike Slim Dress" -- "Nike" (brand, medium priority), "Slim" (style, low priority), "Dress" (category, high priority).
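The following Java sketch is a toy version of three of the features in the table (stop words, spelling correction, and synonyms), chained in one pass over the query. It only illustrates the idea: the dictionaries are hypothetical, the order of the steps is an assumption, and the real features rely on built-in libraries and semantic models rather than fixed lookup tables.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class QueryAnalysisSketch {
    // Hypothetical dictionaries, for illustration only.
    static final Set<String> STOP_WORDS = Set.of("is", "or", "not", "the");
    static final Map<String, String> CORRECTIONS = Map.of("alipapa", "alibaba");
    static final Map<String, List<String>> SYNONYMS =
        Map.of("kfc", List.of("kentucky fried chicken"));

    // Returns the retrieval terms for a query, in order of first appearance.
    static Set<String> analyze(String query) {
        Set<String> terms = new LinkedHashSet<>();
        for (String raw : query.toLowerCase(Locale.ROOT).split("\\s+")) {
            String token = raw.replaceAll("\\p{Punct}", "");     // strip punctuation
            if (token.isEmpty() || STOP_WORDS.contains(token)) {
                continue;                                        // stop word filtering
            }
            token = CORRECTIONS.getOrDefault(token, token);      // spelling correction
            terms.add(token);
            terms.addAll(SYNONYMS.getOrDefault(token, List.of())); // synonym expansion
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(analyze("Running Man!")); // [running, man]
        System.out.println(analyze("Alipapa"));      // [alibaba]
        System.out.println(analyze("KFC"));          // [kfc, kentucky fried chicken]
    }
}
```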

Example: Set up spelling correction

The following steps show how to configure a spelling correction rule.

Create an intervention dictionary

In the OpenSearch console, go to Search Algorithm Center > Retrieval Configuration. Click Dictionary Management.
Click Create. In the Create Query Analysis Dictionary panel, enter a dictionary name, set Dictionary Type to Spelling Correction, and click Save.
In the dictionary list, find your dictionary and click Manage Entries.
Click Add Intervention Entry to create an entry.
Click Save. The entry appears in the intervention entry list.

Create a query analysis rule

Go to Search Algorithm Center > Retrieval Configuration. Click Query Analysis Rule Configuration.
Click Create. In the Create Rule panel, set Intervention Dictionaries to the spelling correction dictionary you created (for example, dic_error).

Test and apply the rule

On the Query Analysis Rule Configuration page, click Search Test in the Actions column to verify the correction behavior.
After verification, click Index Orientation and set the rule as the default query analysis rule.

Step 6: Configure sort expressions

Sort expressions control how search results are ranked. You can specify sort expressions in the query clause to sort results. OpenSearch uses a two-phase ranking approach: rough sort (first-phase ranking) narrows down candidates, and fine sort (second-phase ranking) determines final order.

For details, see Configure sort expressions.
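The two-phase approach can be sketched in a few lines of Java: a cheap rough score trims the matched documents to a candidate set, and a more expensive fine score orders only those candidates. This is a conceptual sketch, not the OpenSearch implementation; the class name, the scoring functions, and the candidateLimit parameter are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleFunction;

final class TwoPhaseRanking {
    static <T> List<T> rank(List<T> matches,
                            ToDoubleFunction<T> roughScore,
                            ToDoubleFunction<T> fineScore,
                            int candidateLimit) {
        if (candidateLimit < 0) {
            throw new IllegalArgumentException("candidateLimit must not be negative");
        }
        // Phase 1: order every match by the cheap rough score, highest first.
        List<T> candidates = new ArrayList<>(matches);
        candidates.sort(Comparator.comparingDouble(roughScore).reversed());

        // Keep only the best candidates for the expensive phase.
        int keep = Math.min(candidateLimit, candidates.size());
        List<T> top = new ArrayList<>(candidates.subList(0, keep));

        // Phase 2: the fine score decides the final order of the survivors.
        top.sort(Comparator.comparingDouble(fineScore).reversed());
        return top;
    }

    public static void main(String[] args) {
        // Each document is {roughScore, fineScore}; the values are made up.
        List<double[]> docs = List.of(
            new double[]{0.9, 0.2}, new double[]{0.8, 0.7}, new double[]{0.1, 1.0});
        List<double[]> ranked = rank(docs, d -> d[0], d -> d[1], 2);
        System.out.println(ranked.get(0)[0] + " then " + ranked.get(1)[0]); // 0.8 then 0.9
    }
}
```

Note that the third document has the best fine score but never reaches phase 2, because its rough score is too low. This is why the choice of rough sort fields matters.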

Create a rough sort expression

In the OpenSearch console, go to Search Algorithm Center > Sort Configuration to open the Policy Management page.
Click Create to add a rough sort expression. Select representative fields such as text score and timeliness score. Rough sort significantly affects search performance, so choose fields carefully.

Create a fine sort expression

Add a fine sort expression for text relevance scoring.

Compare sort results

On the Search Test page, compare the results of a standard query with a query that applies the sort expression.

Step 7: Add drop-down suggestions and category prediction

Drop-down suggestions

Drop-down suggestions help users find queries faster as they type, reducing input effort in e-commerce search. For setup instructions, see Drop-down suggestions.
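The core idea behind drop-down suggestions is prefix matching against known queries. The following Java sketch shows that idea with a sorted set. It is a minimal illustration with a hypothetical class name; it returns matches in alphabetical order, whereas a production suggestion feature would typically also rank them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeSet;

final class PrefixSuggester {
    private final TreeSet<String> queries = new TreeSet<>();

    void add(String query) {
        queries.add(query.toLowerCase(Locale.ROOT));
    }

    // Returns up to 'limit' stored queries that start with the typed prefix.
    List<String> suggest(String prefix, int limit) {
        String p = prefix.toLowerCase(Locale.ROOT);
        List<String> out = new ArrayList<>();
        // In a sorted set, all entries sharing a prefix are contiguous,
        // starting at the first entry that is >= the prefix.
        for (String q : queries.tailSet(p, true)) {
            if (!q.startsWith(p) || out.size() >= limit) {
                break;
            }
            out.add(q);
        }
        return out;
    }

    public static void main(String[] args) {
        PrefixSuggester s = new PrefixSuggester();
        s.add("nike shoes");
        s.add("nike dress");
        s.add("nikon camera");
        System.out.println(s.suggest("nike", 5)); // [nike dress, nike shoes]
    }
}
```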

Category prediction

Category prediction identifies the product category that a search query most likely targets. For setup instructions, see Category prediction.

Step 8: Control result diversity and filtering

Distinct clauses

When multiple products from a single vendor score highly, they may dominate the results page. Distinct clauses enforce diversity so that results include products from multiple vendors. For details, see Distinct clauses.
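The effect described above can be sketched as a post-processing step that caps how many results each vendor contributes. The Java sketch below illustrates the concept only; the class name and the maxPerKey parameter are hypothetical, and the actual distinct clause has its own parameters, described in Distinct clauses.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class VendorDiversity {
    // Walks the ranked results in order and keeps at most maxPerKey results per key.
    static <T, K> List<T> limitPerKey(List<T> rankedResults, Function<T, K> keyOf, int maxPerKey) {
        Map<K, Integer> seen = new HashMap<>();
        List<T> out = new ArrayList<>();
        for (T result : rankedResults) {
            int count = seen.merge(keyOf.apply(result), 1, Integer::sum);
            if (count <= maxPerKey) {
                out.add(result); // relative ranking order is preserved
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Each result is {vendor, product}; the data is made up.
        List<String[]> ranked = List.of(
            new String[]{"A", "a1"}, new String[]{"A", "a2"},
            new String[]{"B", "b1"}, new String[]{"A", "a3"});
        for (String[] r : limitPerKey(ranked, r -> r[0], 1)) {
            System.out.println(r[0] + ":" + r[1]);
        }
    }
}
```

With a cap of one result per vendor, the example keeps the top product of vendor A and the top product of vendor B.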

Filter clauses

Filter clauses let users narrow results by attribute ranges, such as price. For details, see Filter clauses.

The following Java code shows how to filter by price range:

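// Add a lower bound only when the user supplied one.
if (lowPrice != null && !lowPrice.isEmpty()) {
  queryElement.addFilter("price>=" + lowPrice);
}
// Add an upper bound only when the user supplied one.
if (highPrice != null && !highPrice.isEmpty()) {
  queryElement.addFilter("price<=" + highPrice);
}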
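Because the price bounds usually come from user input, it can help to validate them before they reach the filter. The following self-contained Java sketch builds the same two conditions as a single string and rejects non-numeric input. The class name is hypothetical, and joining the conditions with AND is an assumption about the filter syntax; confirm it in Filter clauses.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

final class PriceFilter {
    // Returns a filter expression, or an empty string when no bound is given.
    static String build(String lowPrice, String highPrice) {
        List<String> conditions = new ArrayList<>();
        if (isPresent(lowPrice)) {
            conditions.add("price>=" + toNumber(lowPrice));
        }
        if (isPresent(highPrice)) {
            conditions.add("price<=" + toNumber(highPrice));
        }
        // Assumption: conditions are combined with AND.
        return String.join(" AND ", conditions);
    }

    private static boolean isPresent(String value) {
        return value != null && !value.trim().isEmpty();
    }

    // Throws NumberFormatException for anything that is not a plain number,
    // so arbitrary text never ends up inside the filter expression.
    private static String toNumber(String raw) {
        return new BigDecimal(raw.trim()).toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(build("10", "99.5")); // price>=10 AND price<=99.5
        System.out.println(build("", "50"));     // price<=50
    }
}
```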

What you built

After completing these steps, your e-commerce search prototype supports:

Multi-attribute keyword search across joined tables
Query analysis with spelling correction, stop words, synonyms, and named entity recognition
Two-phase relevance ranking with rough sort and fine sort expressions
Drop-down suggestions and category prediction
Result diversity with distinct clauses and price filtering with filter clauses

OpenSearch provides search APIs and SDKs so you can integrate these capabilities into your application without building and maintaining a custom search infrastructure.

Next steps

API overview -- integrate search into your application
SDK overview -- use language-specific SDKs
Query analysis -- explore additional query processing features
Configure sort expressions -- fine-tune result ranking