OpenSearch: Tailored retrieval analyzers

Last Updated: Mar 06, 2023

Overview

Word analysis is a basic feature of search engines, and its quality directly affects search results. Customers in different industries have different analysis requirements, and only an analyzer that is customized for a customer's application can meet them.

OpenSearch Industry Algorithm Edition provides tailored retrieval models and a set of industry-specific analyzers to meet these requirements. You select an industry-specific analyzer based on your business requirements, create and train a tailored retrieval model on top of it, and then create a custom analyzer based on the trained model. You are not required to import data to your application at any point in this process. When you train the tailored retrieval model, the system automatically extracts the existing data of your application and analyzes the data based on the business logic of your application.

Tailored retrieval models are billed based on storage capacity, computing resources, and model training. For more information, see the billing methods described in Overview.

Create a tailored retrieval model

Before you use a tailored retrieval model, perform the following steps:

  1. Create and train a model.

  2. Create a custom analyzer.

  3. Configure the custom analyzer.

Create and train a model

  1. In the left-side navigation pane, choose Search Algorithm Center > Retrieval Configuration. The Retrieval Configuration page appears. In the left-side pane, click Tailored Retrieval Models. On the page that appears, select an exclusive application for which you want to create a tailored retrieval model, and click Create.

  2. Configure the Model Name, Model Type, Basic Analyzer, Training Fields, and Normalization parameters, and then click Submit.


You can select one of the following options from the Basic Analyzer drop-down list: Chinese - General Analyzer, Chinese - E-commerce Analyzer, IT - Content Analyzer, Industry - General Analyzer for Gaming, Industry - Analyzer for Educational Q&A Search, Industry - IT Content Analysis, and Industry - General Analyzer for E-commerce.

You can specify one or more of the following values for the Normalization parameter: Uppercase to Lowercase, Traditional to Simplified Chinese, and Full-width to Half-width Characters. This parameter is optional. Normalization takes effect only when you perform searches and does not change the content of the original fields.

Important

  • You cannot change the model name after the model is created.

  • You can specify only fields of the SHORT TEXT or TEXT data type for the Training Fields parameter.

  3. After the model is created, it is in the Unavailable state by default. On the Tailored Retrieval Models page, click Train in the Actions column of the model that you created.

Note

  • In most cases, model training takes one to two business days to complete.

  • You can train a model repeatedly. Each time a training run completes, a new model version is added to the Training History section on the details page of the model. The version number increases by 1 with each version.
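To make the three Normalization options from step 2 more concrete, the following minimal Python sketch shows what each option could do to a query string. It is an illustration only: the function names and the tiny Traditional-to-Simplified sample table are hypothetical and are not part of OpenSearch, and the actual normalization that OpenSearch applies may differ.

```python
# Illustrative sketch only. These helpers are hypothetical and are not part of
# any OpenSearch SDK; they mimic the three Normalization options on a query.

# Tiny sample table for illustration. A real Traditional-to-Simplified
# conversion needs a complete mapping.
SAMPLE_TRADITIONAL_TO_SIMPLIFIED = {"學": "学", "習": "习", "書": "书"}


def full_width_to_half_width(text: str) -> str:
    """Map full-width ASCII variants (U+FF01..U+FF5E) and the ideographic
    space (U+3000) to their half-width counterparts."""
    out = []
    for ch in text:
        code = ord(ch)
        if code == 0x3000:
            out.append(" ")
        elif 0xFF01 <= code <= 0xFF5E:
            out.append(chr(code - 0xFEE0))
        else:
            out.append(ch)
    return "".join(out)


def traditional_to_simplified(text: str) -> str:
    """Replace characters found in the sample table; leave others unchanged."""
    return "".join(SAMPLE_TRADITIONAL_TO_SIMPLIFIED.get(ch, ch) for ch in text)


def normalize_query(text: str, lowercase: bool = True,
                    simplify: bool = True, half_width: bool = True) -> str:
    """Apply the selected options to a query string and return the result."""
    if half_width:
        text = full_width_to_half_width(text)
    if simplify:
        text = traditional_to_simplified(text)
    if lowercase:
        text = text.lower()
    return text


if __name__ == "__main__":
    # Full-width "ABC", an ideographic space, then full-width "123".
    query = "\uff21\uff22\uff23\u3000\uff11\uff12\uff13"
    print(normalize_query(query))  # abc 123
```

The sketch returns a new string and leaves its input untouched, which mirrors the statement that normalization does not change the content of the original fields.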

Create a custom analyzer

After the tailored retrieval model is trained and the model enters the Available state, you can create and configure a custom analyzer.

  1. In the left-side navigation pane, choose Search Algorithm Center > Retrieval Configuration. The Retrieval Configuration page appears. In the left-side pane, click Analyzer Management. On the page that appears, click the Text Analyzer tab. Then, click Create.

  2. In the panel that appears, enter a name, select Tailored Model Analyzer as the analyzer type, select your HA3 engine instance, and then select the tailored retrieval model that you created. Then, click Save.

  3. After you create a custom analyzer, you can use the custom analyzer to test whether word analysis is performed as expected and manage entries.


Configure the custom analyzer

After a custom analyzer is created, you can perform offline change operations to make the custom analyzer take effect on an index.

  1. In the left-side navigation pane, choose Instance Management > HA3 Engine. On the page that appears, find the application that you want to manage, and click Details in the Actions column. On the details page that appears, click Modify Offline Application.

  2. In the Index Schema step, find the corresponding index. In the Analysis Method column, specify the custom analyzer for which you configured a tailored retrieval model, and select the model version that you want to use.

  3. After you complete an offline change operation, wait until the index is rebuilt.

  4. After the index is rebuilt, you can test the effects of the tailored retrieval model on the Search Test page.


Details pages

Tailored Retrieval Models page


On the Tailored Retrieval Models page, you can view information about each tailored retrieval model, including the model name, model type, model status, start time of the last training, and status of the latest version. A tailored retrieval model can be in the Available or Unavailable state. You can also click Details, Train, or Delete in the Actions column of each tailored retrieval model to view the model details, train the model, or delete the model.

Note

  • Tailored retrieval models that are referenced by indexes cannot be deleted.

  • If the latest version of a tailored retrieval model is in the Training state, the Retrain button in the Actions column is dimmed and unavailable. Otherwise, you can click Retrain to retrain the model.
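The two rules in the preceding note can be summarized as simple predicates. The following Python sketch is only a way to read those rules; the class and function names are hypothetical and do not correspond to any OpenSearch API.

```python
from dataclasses import dataclass


@dataclass
class ModelRow:
    """Hypothetical view of one row on the Tailored Retrieval Models page."""
    name: str
    latest_version_status: str  # "Training" is the only status named in the note
    referenced_by_index: bool


def can_retrain(row: ModelRow) -> bool:
    """Retraining is unavailable while the latest version is still training."""
    return row.latest_version_status != "Training"


def can_delete(row: ModelRow) -> bool:
    """A model that is referenced by an index cannot be deleted."""
    return not row.referenced_by_index


if __name__ == "__main__":
    row = ModelRow("demo_model", "Training", referenced_by_index=True)
    print(can_retrain(row), can_delete(row))  # False False
```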

Details page of a tailored retrieval model

The details page of a tailored retrieval model consists of the following three sections:

Basic Information: In this section, you can only view the basic information about the tailored retrieval model, including the creation time, model status, start time of the last training, and status of the latest version.

Configuration Information: In this section, you can only view the configurations that you specified when you created or configured the tailored retrieval model. The configurations include the basic analyzer, the training fields, and the normalization settings.

Training History: In this section, you can view information such as the model version, configuration information, version status, start time of the training, end time of the training, and the index that references the tailored retrieval model. You can also test the effects of the model.
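As a rough mental model of the Training History section, each completed training run can be thought of as one record whose version number is 1 higher than the previous one. The Python sketch below assumes that numbering starts at 1 and uses a placeholder status string; both are assumptions, and the class names are hypothetical rather than part of OpenSearch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TrainingRecord:
    """Hypothetical record mirroring the columns listed for Training History."""
    version: int
    status: str
    start_time: str
    end_time: Optional[str] = None
    referenced_index: Optional[str] = None


@dataclass
class TrainingHistory:
    records: List[TrainingRecord] = field(default_factory=list)

    def add_completed_training(self, start_time: str, end_time: str,
                               status: str = "completed") -> TrainingRecord:
        """Append a record whose version is 1 higher than the latest one.

        Starting at version 1 and the "completed" status are assumptions.
        """
        next_version = self.records[-1].version + 1 if self.records else 1
        record = TrainingRecord(next_version, status, start_time, end_time)
        self.records.append(record)
        return record


if __name__ == "__main__":
    history = TrainingHistory()
    history.add_completed_training("2023-03-01 10:00", "2023-03-02 18:00")
    latest = history.add_completed_training("2023-03-05 09:00", "2023-03-06 12:00")
    print(latest.version)  # 2
```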


You can compare the effects of tailored retrieval models in typical cases and download the comparison results.


Limits

  • This feature is available only for Industry Algorithm Edition - Dedicated Cluster instances.

  • A maximum of five tailored retrieval models can be created in a single instance.

  • A tailored retrieval model is created for a specific application and cannot be configured across applications.

  • Only text analyzers can be customized.
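If you track planned models in your own tooling, a pre-check along the following lines can catch violations before you open the console. This Python sketch is a hypothetical helper, not an OpenSearch API. It encodes only two rules stated in this topic: a maximum of five models per instance, and training fields limited to the SHORT TEXT or TEXT data type.

```python
from typing import Dict, List

# Values taken from this topic; the names themselves are hypothetical.
MAX_MODELS_PER_INSTANCE = 5
ALLOWED_TRAINING_FIELD_TYPES = {"SHORT TEXT", "TEXT"}


def check_new_model(existing_model_count: int,
                    training_fields: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means no rule is violated.

    training_fields maps a field name to its data type.
    """
    problems = []
    if existing_model_count >= MAX_MODELS_PER_INSTANCE:
        problems.append(
            f"instance already has {existing_model_count} models "
            f"(maximum {MAX_MODELS_PER_INSTANCE})"
        )
    for name, data_type in training_fields.items():
        if data_type not in ALLOWED_TRAINING_FIELD_TYPES:
            problems.append(
                f"field '{name}' has type {data_type}; "
                "only SHORT TEXT or TEXT is allowed"
            )
    return problems


if __name__ == "__main__":
    for problem in check_new_model(5, {"title": "TEXT", "price": "INT"}):
        print(problem)
```

Returning a list of problems instead of raising on the first one lets you report every violation at once.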