OpenSearch: Custom synonym models for retrieval

Last Updated: Jun 21, 2026

Overview

Before retrieving documents, OpenSearch analyzes and processes the semantics of a user's search query. The synonym feature expands the search query with equivalent terms to retrieve a broader set of relevant documents. Because business scenarios vary, only an application-specific synonym model can ensure optimal search performance.

OpenSearch provides a set of domain-specific synonym models. You can use the corresponding industry-specific analyzer to train a dedicated custom synonym model, which automatically extracts synonyms from your existing data and adapts to it. You can also import your own synonym data to improve the trained model. For more information, contact technical support.

You are charged for tailored retrieval models based on storage capacity, compute cost, and model training. For pricing details, see Billing of OpenSearch Industry Algorithm Edition.

Quick start

To create and use a custom synonym model, follow these three steps:

  1. Create and train the custom synonym model.

  2. Configure query analysis to reference the trained model.

  3. Adjust the model as needed by using an intervention dictionary.

Create and train a model

  1. Log on to the OpenSearch console and navigate to OpenSearch Industry Algorithm Edition > Search Algorithm Center > Retrieval Configuration > Tailored Retrieval Models > Create.

  2. Enter a Model Name, select Synonym Model, choose the Training Fields, and then click Submit.

    In the Target Application drop-down list, select an application. The Model Name must be 1 to 30 characters long, start with a letter, and contain only letters (uppercase or lowercase), digits, and underscores (_).

Note
  • The model name cannot be changed after creation.

  • Training fields must be of the short_text or text type.

  3. You can click Train to train the synonym model, or click Query Analysis Configuration to use the model in query analysis. Click Finish to close the page.

  4. When created, a model has a Model Status of Unavailable and a Latest Version Status of Pending Training. On the Tailored Retrieval Models page, find the model and click Train in the Actions column.

Note
  • Model training time varies with the data volume and typically takes several hours.

  • Each time training is complete, the system adds a new, incrementally numbered version to the Training History section on the model details page.

  5. After a synonym model is trained, its Model Status changes to Available, and its Latest Version Status changes to Trained and Ready.
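The Model Name rule from the creation step (1 to 30 characters, starting with a letter, and using only letters, digits, and underscores) can be checked before you open the console. The following is a minimal sketch of that rule as a regular expression. The function name is_valid_model_name is hypothetical and is not part of any OpenSearch SDK, and the sketch assumes that "letters" means ASCII letters.

```python
import re

# Rule stated in the creation step: 1 to 30 characters, starts with a letter,
# and contains only letters, digits, and underscores.
# Assumption: "letters" means ASCII letters A-Z and a-z.
MODEL_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,29}")


def is_valid_model_name(name: str) -> bool:
    """Return True if name satisfies the Model Name rule (hypothetical helper)."""
    return MODEL_NAME_PATTERN.fullmatch(name) is not None


# Print the check result for a few candidate names.
for candidate in ["my_test", "1model", "syn-model"]:
    print(candidate, is_valid_model_name(candidate))
```

Because the model name cannot be changed after creation, a local check of this kind may help you avoid a failed submission or a name that you later regret.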

Configure query analysis

When you configure a query analysis rule, select the Synonym checkbox. In the configuration panel that appears, for Dictionaries, select Custom Model. Select the Custom Synonym Model that you created (for example, my_test), a Custom Synonym Model Version (for example, 2), and an optional Intervention Dictionary (for example, syn). After you complete the configuration, click Submit.

Note
  • If you have not created a query analysis rule, create one first. For more information, see Query analysis.

  • For more information about how to create an intervention dictionary for synonyms and add entries, see Synonym intervention dictionaries.

Model details

Tailored retrieval models

This page lists your tailored retrieval models and allows you to manage them.

The model list includes various model types, such as Synonym Model, Weighting Model, and Text Analyzer Model. The Actions column provides options such as Details and Train. You can use the Create button at the top of the page to create a custom model.

Note
  • The model list includes the following columns: Model Name, Model Type, Model Status, Last Training Start Time, Latest Version Status, and Actions.

  • You cannot delete a tailored retrieval model that is referenced by a rule.

Synonym model details

The details page includes three sections: Basic Information, Configuration Information, and Training History.

1. After a custom synonym model is trained, you can click Details and then click View under Referenced Rules to see the query analysis rules that reference the current synonym model.

In the left-side navigation pane, choose Search Algorithm Center > Retrieval Configuration, and then click the Tailored Retrieval Models tab. Find the target synonym model and click Details in the Actions column. The Basic Information section displays the creation time, last training start time, Model Status (for example, "Available"), and Latest Version Status (for example, "Trained and Ready"). The Configuration Information section displays the training fields (for example, title). The Training History table includes columns for the model version, version status, training start time, training end time, referenced rules, and actions (including deletion).

2. Test the model performance

In the Training History table, find the target model version and click Performance Test in its Actions column. On the performance test page, enter a search query in the Test Text box to view the corresponding Synonym Expansion Results.

3. View Performance Comparison

The Model Difference Rate is the percentage of difference between the synonym results of two models. Terms that differ are highlighted in red.

Note
  • You can compare the system's built-in model with different versions of the current model. After you click Compare, the synonym performance comparison is displayed.

  • The typical case comparison displays up to 200 cases where synonym results differ, including the original text and the results from both Model 1 and Model 2.

  • You can enter a test query in the test text box to check the corresponding synonym results.
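This page does not state how the Model Difference Rate is calculated. One plausible reading is the share of test queries for which the two models return different synonym sets. The following sketch illustrates that reading only. The function names and the sample data are hypothetical, and the console may use a different formula.

```python
from typing import List, Mapping, Set, Tuple

SynonymResults = Mapping[str, Set[str]]


def differing_cases(model_1: SynonymResults, model_2: SynonymResults,
                    limit: int = 200) -> List[Tuple[str, Set[str], Set[str]]]:
    """List up to `limit` queries whose synonym sets differ between the two models."""
    shared = sorted(model_1.keys() & model_2.keys())
    cases = [(q, model_1[q], model_2[q]) for q in shared if model_1[q] != model_2[q]]
    return cases[:limit]


def model_difference_rate(model_1: SynonymResults, model_2: SynonymResults) -> float:
    """Percentage of shared queries with differing synonym sets (assumed definition)."""
    shared = model_1.keys() & model_2.keys()
    if not shared:
        return 0.0
    differing = sum(1 for q in shared if model_1[q] != model_2[q])
    return 100.0 * differing / len(shared)


# Hypothetical synonym results for two model versions.
version_1 = {"medicine": {"drug"}, "search": {"find"}}
version_2 = {"medicine": {"drug", "medication"}, "search": {"find"}}
print(model_difference_rate(version_1, version_2))
print(differing_cases(version_1, version_2))
```

The default limit of 200 mirrors the cap on typical cases mentioned in the note above.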

Search test page

  1. In the OpenSearch console, navigate to OpenSearch Industry Algorithm Edition > Feature Extensions > Search Test to test the search performance.

On the Search Test page, enter the search query default:'medicine'. After synonym expansion, the actual search query is rewritten as (default:'medicine') OR (default:'drug') OR (default:'medicinal' AND default:'product') OR (default:'medical' AND default:'medicine'). The Synonym row in the Query Analysis panel on the right shows the expansion result as drug/medicinal product/medical medicine medicine.
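The shape of the rewritten query above can be reproduced with a few lines of code: each synonym becomes its own parenthesized clause, multi-word synonyms are joined with AND, and the clauses are joined with OR. The following sketch only mimics that output format. The function names are hypothetical, the synonym list is inferred from the rewritten query shown above, and the actual rewriting is performed inside OpenSearch.

```python
from typing import List


def term_clause(index: str, phrase: str) -> str:
    """Build one clause; the words of a multi-word phrase are joined with AND."""
    return " AND ".join(f"{index}:'{word}'" for word in phrase.split())


def expand_query(index: str, term: str, synonyms: List[str]) -> str:
    """Join the original term and each synonym as parenthesized OR clauses."""
    clauses = [term_clause(index, term)]
    clauses.extend(term_clause(index, synonym) for synonym in synonyms)
    return " OR ".join(f"({clause})" for clause in clauses)


# Synonyms inferred from the rewritten query in the Search Test example.
rewritten = expand_query(
    "default", "medicine", ["drug", "medicinal product", "medical medicine"]
)
print(rewritten)
```

With these inputs, the sketch produces the same string as the rewritten query in the Search Test example.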

  2. To view the compute cost for each search request, you can add the custom parameter fetch=result:compute_cost and enable Source Code Mode. On the Search Test page, set the query clause to default:'medicine' and add this parameter to retrieve the compute cost (LCU) details. The compute_cost field in the JSON response contains an algos array, which shows the cost for the synonym algorithm (function: "synonym"), such as value: 5, and the total LCU consumption of the index, such as value: 6.44.

  3. If you use two or more models for an index search, connect them with the OR operator.

An example query statement is zw_ds_model:'search' OR it_model:'black'. The returned compute_cost structure contains an algos array. Each element in the array shows the name of an analyzer and its compute cost value. The value for both zw_ds_model and it_model is 2.1, which represents the compute cost of each model.
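If you save the JSON response returned with fetch=result:compute_cost, you can extract the synonym cost from it programmatically. The following sketch relies only on the names mentioned above (algos, function, and value) and searches for algos arrays at any nesting depth, because this page does not show the full response layout. The sample payload and the function names are hypothetical.

```python
import json
from typing import Any, Dict, Iterator


def iter_algos(node: Any) -> Iterator[Dict[str, Any]]:
    """Yield every dict found inside any "algos" array, at any nesting depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "algos" and isinstance(value, list):
                for entry in value:
                    if isinstance(entry, dict):
                        yield entry
            else:
                yield from iter_algos(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_algos(item)


def synonym_cost(response_text: str) -> float:
    """Sum the value of every algos entry whose function is "synonym"."""
    payload = json.loads(response_text)
    return sum(entry.get("value", 0)
               for entry in iter_algos(payload)
               if entry.get("function") == "synonym")


# Hypothetical payload: the nesting around "algos" is an assumption.
SAMPLE_RESPONSE = """
{"result": {"compute_cost": [
    {"value": 6.44, "algos": [{"function": "synonym", "value": 5}]}
]}}
"""
print(synonym_cost(SAMPLE_RESPONSE))
```

Searching recursively keeps the sketch independent of the exact position of compute_cost in the response, which may differ from the sample.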

Limitations

  • This feature is available only for a Dedicated Cluster Instance of Industry Algorithm Edition.

  • You can create up to five custom models per instance, and each model can have a maximum of three versions.

  • A tailored retrieval model created for a specific application cannot be used in other applications.